1. Introduction

Individuals sort in a variety of fashions. The workplace, the school of one’s child, the 
choice of neighborhood in which to reside, and the selection of a spouse are all important 
arenas in which a choice of peers and access to particular goods and networks is explicitly 
or implicitly made. The aim of this chapter is to review the subset of the literature in 
the rapidly growing field of education and inequality that is primarily concerned with 
how individuals sort and the consequences of this for the accumulation of human capital, 
equity, efficiency and welfare. 

At first blush, sorting may seem like a rather strange lens through which to examine 
education. After all, this field has been primarily concerned with examining issues such 
as the returns to education, the nature of the education production function or, at a 
more macro level, the relationship between education and per-capita output growth.^1
A bit more thought, though, quickly reveals that sorting is an integral component of 
these questions. Who one goes to school with, who one’s neighbors are, who one 
works with, and who is a member of one’s household, are all likely to be important 
ingredients in determining both the resources devoted to and the returns to human 
capital accumulation. 

It is interesting to note that in all these spheres there is at least some evidence 
indicating that sorting is increasing in the US. Jargowsky (1996), for example, examines
the changing pattern of residential segregation in the US over the last few decades. 
He finds that although racial and ethnic segregation has stayed fairly constant (with 
some small decline in recent years), segregation by income has increased (for Whites, 
Blacks and Hispanics) in all US metropolitan areas from 1970 to 1990. This increased 
economic segregation, and the fact that schools increasingly track students by ability, 
suggests that there is likely to be increased sorting at the school or classroom level by 
income and ability. Kremer and Maskin (1996) find evidence for the US, Britain, and 

^1 For a survey of the education production function literature, see Hanushek (1986); for returns to
education see, for example, Heckman, Layne-Farrar, and Todd (1996); for education and growth see,
for example, Benhabib and Spiegel (1994).

France that there is increased sorting of workers into firms, with some high-tech firms
(e.g., Silicon Valley firms) employing predominantly high-skilled workers and low-tech
firms (e.g., the fast-food industry) employing predominantly low-skilled workers. Lastly,
there is also some indication of greater sorting at the level of household partner (or
“marital” sorting). Although the correlation between spousal partners in terms of years
of education has not changed much over the last few decades (see Kremer (1997)), the
conditional probability of some sociological barriers being crossed (e.g., the probability
that an individual with only a high-school education will match with another with a
college education) has decreased, indicating greater household sorting (see Mare (1991)).

This chapter will examine some of the literature that deals with the intersection of 
sorting, education and inequality. This review is not meant to be exhaustive, but to give 
a flavor of some of the advances in the theory and quantitative evidence. Furthermore, it 
should be noted that there is no overarching theoretical framework in this field. Rather, 
different models are interesting because of how they illuminate some of the particular 
interactions among these variables and others, for example, the role of politics, the
interaction between private and public schools, or the efficacy of different mechanisms
(e.g., markets versus tournaments) in solving assignment problems. Thus, rather than
sketch the contribution of each paper, I have chosen to discuss a few models in depth.
Furthermore, as a primary concern in this area is the magnitude of different effects,
wherever possible I focus on the contributions that have attempted to evaluate these.

The organization of the chapter is as follows. I begin with the topic of residential 
sorting. Local schooling is prevalent in most of the world. This policy easily leads to 
residential sorting and may have important implications for education and inequality, 
particularly in countries like the US in which the funding of education is also largely
at the local level. I also use this section to review the theory of sorting. Next, I 
turn to examining sorting at the school level. The papers here are different as they 
are primarily concerned with the interaction of public and private schools and with the 
properties of different mechanisms. Lastly, I turn to recent work on household sorting
and its consequences for education and inequality.

2. Sorting into Neighborhoods



Neighborhoods do not tend to be representative samples of the population as a whole.
Why is this? Sorting into neighborhoods may occur because of preferences for amenities 
associated with a particular neighborhood (say parks), because of some individuals’ 
desire to live with some types of people or not to live with some others (say ethnic 
groups who wish to live together in order to preserve their culture, or who end up doing 
so as a result of discrimination), and in response to economic incentives. This chapter
will be primarily concerned with the last of these, and in particular with the endogenous
sorting that occurs in response to economic incentives that arise as a result of education 
policies. 

Primary and secondary education is a good that is provided locally. In industrialized 
countries, the overwhelming majority of children attend public schools (in the US a
bit over 91 percent in 1996 and similar percentages in other countries).^2 Typically,
children are required to live in the school’s district to attend school there, making
a neighborhood’s school quality a primary concern of families in deciding where to
reside. Furthermore, in most countries at least some school funding (usually that used
to increase spending above some minimum) is provided locally; this is particularly true
in the US, in which only 6.6 percent of funding is at the federal level, 48 percent is at
the state level, and 42 percent is at the local level.^3

Does it matter that education is provided at the local level? How does local provision 
of education affect the accumulation of human capital, its distribution, and efficiency 
in general? What are the dynamic consequences of local provision? How do other 
systems of financing and providing education compare? These are some of the questions
this section will explore. I will start out with a brief overview of the economics of 
sorting, much of which will carry through to the other sections as well. 

2.1. Multi-Community Models: The Economics of Sorting 

Characterizing equilibrium in models in which heterogeneous individuals can choose
among a given number of potential residences, and in which these choices in aggregate 

^2 Digest of Education Statistics (1999).

^3 The remaining percentage comes from other miscellaneous sources. These figures are for 1996-97
(Digest of Education Statistics (1999)).

affect the attributes of the community, is in general a difficult task. Since Westhoff
(1977), economists working with what are often called “multi-community models” have
tended to impose a single-crossing condition on preferences in order to obtain and
characterize equilibria in which individuals either partially or completely separate out by
type.^4 As will be discussed in further detail below, the single-crossing condition also
has two other very useful implications: (i) it guarantees the existence of a majority
voting equilibrium over p; (ii) in many models it allows one to get rid of “trivial”
equilibria (e.g., one in which all communities are identical) when a local stability condition
is employed.

A typical multi-community model consists of a given number of communities, each
associated with a bundle (q, p). These bundles consist of a good or input that is provided
in some quality or quantity q at the community level and of a community-level price p
of some (usually other) good or service. The latter can simply be a price associated
with residing in the neighborhood, e.g., a local property tax. Thus, we can assess the
indirect utility of an individual from residing in a given community as $V(q, p; y)$,
where y is an attribute of the individual such as income, ability, parental human capital,
wealth or taste. We will assume throughout that q is “good” in the sense that $V_q > 0$,
whereas $V_p < 0$.

Individuals choose a community in which to reside. In these models, equilibria in
which individuals sort into communities along their characteristic y are obtained by
requiring the slope of indifference curves in (q, p) space,

$$\left.\frac{dp}{dq}\right|_{V=\bar{V}} = -\frac{V_q}{V_p} \tag{2.1}$$

to be everywhere increasing (or decreasing) in y. This implies that indifference curves
cross only once and that where they do, if (2.1) is increasing in y, then the slope of the
curve of an individual with a higher y is greater than that of one with a lower y (the opposite
if (2.1) is decreasing in y).

The assumption of a slope that increases (decreases) in y ensures that if an individual
with $y_i$ prefers the bundle $(q_j, p_j)$ offered by community j to some other bundle $(q_k, p_k)$
offered by community k, and $p_j > p_k$, then the same preference ordering over these
bundles is shared by all individuals with $y > y_i$ ($y < y_i$). Alternatively, if the individual
with $y_i$ prefers $(q_k, p_k)$, then community k will also be preferred to community j by all
individuals with $y < y_i$ ($y > y_i$).

^4 In games of asymmetric information (e.g., signalling models, insurance provision, etc.), the
assumption of single-crossing indifference curves is used in order to obtain either partially or completely
separating equilibria.

Either an increasing or decreasing slope can be used to obtain separation.^5 Henceforth,
unless explicitly stated otherwise, I will assume that (2.1) is increasing in y, i.e.,

$$\frac{\partial}{\partial y}\left(\left.\frac{dp}{dq}\right|_{V=\bar{V}}\right) = -\frac{V_{qy}V_p - V_{py}V_q}{V_p^2} > 0 \tag{2.2}$$

We shall refer to equilibria in which there is (at least some) separation by characteristic
as sorting or stratification.
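
As a rough numerical illustration of conditions (2.1) and (2.2), the sketch below approximates the indifference-curve slope $-V_q/V_p$ by central finite differences and checks that it rises with y at a given point. The functional form `V_log` (log utility over consumption y - p and over quality q) and all function names are assumptions chosen for the example; they do not come from the chapter.

```python
import math

def indifference_slope(V, q, p, y, h=1e-6):
    """Slope dp/dq of the indifference curve through (q, p): -V_q / V_p."""
    V_q = (V(q + h, p, y) - V(q - h, p, y)) / (2 * h)
    V_p = (V(q, p + h, y) - V(q, p - h, y)) / (2 * h)
    return -V_q / V_p

def single_crossing_holds(V, q, p, ys):
    """True if the slope is strictly increasing in y at the point (q, p)."""
    slopes = [indifference_slope(V, q, p, y) for y in ys]
    return all(a < b for a, b in zip(slopes, slopes[1:]))

# Hypothetical example: p is a lump-sum price, consumption is y - p.
def V_log(q, p, y):
    return math.log(y - p) + math.log(q)

incomes = [2.0, 3.0, 5.0, 8.0]
slopes = [indifference_slope(V_log, 1.0, 1.0, y) for y in incomes]
holds = single_crossing_holds(V_log, 1.0, 1.0, incomes)
```

For this particular form the slope is (y - p)/q analytically, so the check only confirms what the algebra already says; the same helper can be pointed at other indirect utility functions, bearing in mind that it tests a single (q, p) point rather than the whole space.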

Condition (2.2) is very powerful. Independently of the magnitude of the expression, 
the fact that it is positive implies that individuals have an incentive to sort. As we 
shall discuss in the next section, this will be problematic for efficiency since it implies 
that even very small private incentives to sort will lead to a stratified equilibrium,
independently of the overall social costs (which may be large) from doing so. 

There are many economic situations in which condition (2.2) arises naturally. Suppose,
for example, that q is the quality of education and that this is determined by
either a lump sum or proportional tax p on income. If individuals are, for example,
heterogeneous in income (so y denotes the income of the individual), then this
condition would imply that higher-income individuals are willing to pay more (either in
levels or as a proportion of their income, depending on the definition of p) in order to
obtain a greater quality of education. This can then result in an equilibrium stratified
along the dimension of income. Alternatively, if the quality of education is determined
by the mean ability of individuals in the community school, p is the price of housing
in the community, and individuals are heterogeneous in ability y, then (2.2) will be
met if higher-ability individuals are willing to pay a higher price of housing in order
to obtain higher quality (mean ability) schooling, allowing the possibility of a stratified
equilibrium along the ability dimension.

It is important to note, given the centrality of borrowing constraints in the human
capital literature, that differential willingness to pay a given price is not the only
criterion that determines whether sorting occurs.^6 Suppose, for example, that individuals
are unable to borrow against future human capital or, less restrictively, that individuals
with lower income, or lower wealth, or whose parents have a lower education level face
a higher cost of borrowing. Then even in models in which there is no other incentive to
sort (e.g., in which the return to human capital is not increasing in parental assets or,
more generally, in which $V_q$ is not a function of y), there will nonetheless be an incentive
to sort if the cost of residing in communities with higher q’s (i.e., the effective p that
individuals face) is decreasing in y. So, for example, if individuals with lower assets
face a higher effective cost of borrowing (they are charged higher rates of interest by
banks), then they will be outbid by higher-asset individuals for housing in communities
with higher q’s.

^5 Note that although either assumption can be used to obtain separation, the economic implications
are very different. If increasing, then in a stratified equilibrium higher y individuals would obtain a
higher (q, p) whereas, if decreasing, the high (q, p) bundle would be obtained by lower y individuals.

In many variants of multi-community models not only does (2.2) give rise to stratified
equilibria, but it also implies that all locally stable equilibria must be stratified.^7 In
particular, the equilibrium in which all communities offer the same bundle, and thus
each contains a representative slice of the population, is locally unstable.^8

There are many local stability concepts that can be imposed in multi-community 
models. A particularly simple one is to define local stability as the property that the 
relocation of a small mass of individuals from one community to another implies that 
under the new configuration of (q, p) in these communities, the relocated individuals
would prefer to reside in their original community. More rigorously, an equilibrium
is locally stable if there exists an $\epsilon > 0$ such that, for all possible combinations of
measure $\delta$ ($0 < \delta < \epsilon$) of individuals $y_i \in A_j$ (where $A_j$ is the set of individuals that
in equilibrium reside in community j), a switch in residence from community j to k
implies

$$V(q_k(\delta), p_k(\delta), y) < V(q_j(\delta), p_j(\delta), y) \quad \forall y \in A_{jk} \tag{2.3}$$

where $(q_l(\delta), p_l(\delta))$ are the new bundles of (q, p) that result in community $l = j, k$. Thus,
condition (2.3) requires that, for all individuals who switch residence (the set $A_{jk}$), at
the new bundles they should still prefer community j. This condition is required to hold

^6 For human capital models in which imperfections in credit markets play a central role, see Fernández
and Rogerson (1998), Galor and Zeira (1993), Ljungqvist (1993), and Loury (1981), among others.

^7 In many settings this gives rise to a unique locally stable equilibrium.

^8 Note that this zero-sorting configuration is always an equilibrium in multi-community models as no
single individual has an incentive to move.

for all community pairs considered.^9

To see why the equilibrium with no sorting is rarely locally stable, consider, for 
example, the relocation of a small mass of high y individuals from community j to k. 
In models in which the provision of the local good is decided by majority vote, this will 
tend to make the new community more attractive to the movers (and the old one less 
attractive) since the median voter in community k will now have preferences closer to 
those of the high y individuals whereas the opposite will be true in community j. In 
models in which q is an increasing function of the mean of y (or an increasing function
of an increasing function of the mean of y), such as when q is spending per student or
the average of the human capital or ability of parents or students, then again this move 
will make community k more attractive than community j for the high y movers. Thus, 
in all these cases the no-sorting equilibrium will be unstable. 
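
The relocation argument can be made concrete with a small sketch. It assumes, purely for illustration, two initially identical communities that each hold a mass 0.5 of high-y and 0.5 of low-y residents, a local good q equal to the mean of y in the community, a common price p, and a hypothetical payoff `y * q - p`; none of these specifics are taken from the chapter.

```python
def community_mean(high_mass, low_mass, y_high, y_low):
    """Mean of y in a community holding the given masses of each type."""
    total = high_mass + low_mass
    return (high_mass * y_high + low_mass * y_low) / total

def utility(q, p, y):
    # Hypothetical payoff; any V with V_q > 0 gives the same ranking here
    # because both communities charge the same p.
    return y * q - p

y_high, y_low, p = 2.0, 1.0, 0.5
delta = 0.01  # small mass of high-y movers from community j to community k

q_j = community_mean(0.5 - delta, 0.5, y_high, y_low)  # origin after the move
q_k = community_mean(0.5 + delta, 0.5, y_high, y_low)  # destination after the move
gain = utility(q_k, p, y_high) - utility(q_j, p, y_high)
```

Because the movers strictly prefer the destination at the new bundles, the inequality in (2.3) fails for this relocation, which is the sense in which the zero-sorting configuration is locally unstable.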

In several variants of multi-community models existence of an equilibrium (other 
than the unstable one with zero sorting) is not guaranteed. For example, in a model 
in which the community bundle is decided upon by majority vote and voters take community
composition as given, a locally stable equilibrium may fail to exist.^10 The reason
for this is that although there will exist (often infinite) sequences of community bundles 
that sort individuals into communities, majority vote need not generate any of these 
sequences. Introducing a good (e.g., housing) whose supply is fixed at the local level
(so that the entire adjustment is in prices), though, will typically give rise to existence.^11

Condition (2.2) also has an extremely useful implication for the political economy
aspect of multi-community models. Suppose that p and q are functions of some other
variable t to be decided upon by majority vote by the population in the community
(say a local tax rate). They may also be functions of the characteristics
of the (endogenous) population in the community. An implication of (2.2) is that 
independently of whether p and q are “nicely” behaved functions of t, the equilibrium 
outcome of majority vote over t will be the value preferred by the individual whose y is 
median in the community. 

The proof of this is very simple. Consider the (feasible) bundle $(\tilde{q}, \tilde{p})$ preferred by

^9 See, e.g., Fernández and Rogerson (1996). If communities have only a fixed number of slots for
individuals, as in models in which the quantity of housing is held fixed, then this definition must be
amended to include the relocation of a corresponding mass of individuals from community k to j.

^10 See Westhoff (1977) and Rose-Ackerman (1979).

^11 See, for example, Nechyba (1997).

the median y individual in the community, henceforth denoted $\tilde{y}$. An implication of
(2.2) is that any feasible (q, p) bundle that is greater than $(\tilde{q}, \tilde{p})$ will be rejected by at
least 50 percent of the residents in favor of $(\tilde{q}, \tilde{p})$, in particular by all those whose y is
smaller than $\tilde{y}$. On the other hand, any feasible bundle with a (q, p) lower than $(\tilde{q}, \tilde{p})$
will also be rejected by at least 50 percent of the residents, namely all those with $y > \tilde{y}$. Thus,
the bundle preferred by $\tilde{y}$ will be chosen by majority vote.^12
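
The median-voter argument can be checked numerically in a simple case. The sketch below assumes a proportional income tax t that funds q = t * mu (mu being mean community income) and the hypothetical utilities u(c) = -1/c and v(q) = -1/q, which satisfy the single-crossing condition; the income list and all function names are illustrative choices rather than anything specified in the text.

```python
import statistics

def utility(t, y, mu):
    """u(c) + v(q) with u(c) = -1/c, v(q) = -1/q, c = y * (1 - t), q = t * mu."""
    return -1.0 / (y * (1.0 - t)) - 1.0 / (t * mu)

def preferred_tax(y, mu):
    """Peak of utility in t for this specification: t / (1 - t) = sqrt(y / mu)."""
    r = (y / mu) ** 0.5
    return r / (1.0 + r)

def wins_pairwise(t_star, t_alt, incomes, mu):
    """True if at least half the residents weakly prefer t_star to t_alt."""
    votes = sum(utility(t_star, y, mu) >= utility(t_alt, y, mu) for y in incomes)
    return 2 * votes >= len(incomes)

incomes = [1.0, 2.0, 3.0, 4.0, 10.0]  # hypothetical community
mu = statistics.mean(incomes)
t_median = preferred_tax(statistics.median(incomes), mu)

# Pit the median resident's preferred tax against a grid of alternatives.
alternatives = [k / 100 for k in range(1, 100)]
median_wins = all(wins_pairwise(t_median, t, incomes, mu) for t in alternatives)
```

In this example residents with income below the median vote down any higher tax and those above it vote down any lower one, mirroring the two halves of the proof.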

It is also important to note that even in the absence of a single-crossing condition, to 
the extent that education is funded in a manner that implies redistribution at the local 
level, wealthier individuals will have an incentive to move away from less wealthy ones. 
This is by itself a powerful force that favors sorting but often requires a mechanism (e.g.,
zoning) to prevent poorer individuals from chasing richer individuals in order to enjoy both
a higher q and a lower p.

For example, a system of local provision of education funded by a local property 
tax implicitly redistributes from those with more expensive housing to those with less 
expensive housing in the same neighborhood. The extent of redistribution, though,
can be greatly reduced by zoning regulations that, for example, require minimum lot
sizes. This will raise the price of living with the wealthy and thus greatly diminish
the amount of redistribution that occurs in equilibrium. In several models, to simplify
matters, it will be assumed that mechanisms such as zoning ensure perfect sorting. 

2.2. The Efficiency of Local Provision of Education 

The simplest way to model the local provision of education is in a Tiebout model 
with (exogenously imposed) perfect sorting. In this model, individuals with different 
incomes $y_i$ but with identical preferences over consumption c and quality of education q
sort themselves into homogeneous communities. Each community maximizes the utility
of its own representative individual subject to the individual or community budget
constraint. Let us assume that the quality of education depends only on spending per
student (i.e., the provision of education exhibits constant returns to scale and there are
no peer effects). Then, perfect sorting is Pareto efficient. Note that this system is

^12 See Westhoff (1977) and Epple and Romer (1991). Also see Gans and Smart (1996) for a more
general ordinal version of single crossing and existence of majority vote.

^13 See Fernández and Rogerson (1997b) for an analysis which endogenizes zoning, sorting, and the
provision of education.

identical to a purely private system of education provision.

The model sketched in the paragraph above often guides many people’s intuition 
in the field of education. This is unfortunate as it ignores many issues central to the
provision of education. In particular, it ignores the fact that education is an investment
that benefits the child and potentially affects the welfare of others as well. These are
important considerations, as the fact that education is primarily an investment rather
than a consumption good implies that borrowing constraints may have significant dynamic
consequences; the fact that education primarily affects the child’s (rather than
parental) welfare raises the possibility that parents may not be making the best decisions
for the child; and the potential externalities of an agent’s education raise the
usual problems for Pareto optimality.

Below I explore some departures from the assumptions in the basic Tiebout framework
and discuss how they lead to inefficiency of the stratified equilibrium. This will
make clear a simple pervasive problem associated with sorting, namely that
utility-maximizing individuals do not take into account the effect of their residence decisions
on community variables. I start out by discussing the simplest modification to the basic
Tiebout model: reducing the number of communities relative to types.

Following Fernández and Rogerson (1996), consider an economy with a given number
of communities $j \in \{1, 2, \ldots, N\}$, each (endogenously) characterized by a proportional
income tax rate $t_j$ and a quality of education $q_j$ equal to per-pupil expenditure, i.e.,
$q_j = t_j \mu_j$, where $\mu_j$ is mean income in community j. Individuals who differ in income
$y_i$, $i \in \{1, 2, \ldots, I\}$ (with $y_1 > y_2 > \ldots > y_I$), simultaneously decide in which
community, $C_j$, they wish to reside. Once that
decision is taken, communities choose tax rates via majority vote at the community
level. Individuals then consume their after-tax income and obtain education.^14

^14 Very often the literature in this field has implicitly adopted a sequencing such as the one outlined
above. Making the order of moves explicit as in Fernández and Rogerson (1996) allows the properties
of equilibrium (e.g., local stability) to be studied in a more rigorous fashion. It would also be of interest
to examine properties of models in which communities act more strategically and take into account
the effect of their tax rate on the community composition. There is no reason to believe that this
modification would generate an efficient equilibrium, however.

Assume for simplicity that individual preferences are characterized by the following
separable specification:

$$u(c) + v(q) \tag{2.4}$$

so that the sorting condition (2.2) is satisfied if $-u''(c)c/u'(c) > 1$, $\forall c$. We will henceforth
assume that the inequality is satisfied, ensuring that individuals with higher income are
willing to suffer a higher tax rate for higher quality.
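
The step from the separable specification to the stated inequality can be spelled out as follows. This is a sketch of the algebra under the setup described above (consumption $c = y(1-t)$, with the tax rate t playing the role of the price p); the intermediate notation is ours rather than the chapter's.

```latex
\begin{align*}
V(q,t;y) &= u\big(y(1-t)\big) + v(q), \qquad V_q = v'(q), \qquad V_t = -y\,u'(c),\\[4pt]
\left.\frac{dt}{dq}\right|_{V=\bar V} &= -\frac{V_q}{V_t} = \frac{v'(q)}{y\,u'(c)},\\[4pt]
\frac{\partial}{\partial y}\left(\frac{v'(q)}{y\,u'(c)}\right)
  &= -\,\frac{v'(q)\,\big[u'(c) + c\,u''(c)\big]}{\big(y\,u'(c)\big)^2}
  \qquad \text{using } y(1-t) = c,\\[4pt]
\text{so the slope is increasing in } y
  &\iff u'(c) + c\,u''(c) < 0
  \iff -\frac{u''(c)\,c}{u'(c)} > 1 .
\end{align*}
```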

Suppose that the number of communities is smaller than the number of income
types.^16 In such a case the equilibrium will generally not be Pareto efficient. The
clearest illustration of this can be given for the case in which individuals have preferences
such that an increase in the mean income of the community ceteris paribus decreases
the tax rate that any given individual would like to impose. As the preferred tax
rate of an individual is given by equating $u'(c)y_i$ to $v'(q)\mu_j$, this is ensured by assuming
$-v''(q)q/v'(q) > 1$ (note that this is the parallel of the condition on u that generates sorting).^17

As discussed previously, the result of majority vote at the community level is the 
preferred tax rate of the median income individual in the community. A few characteristics
of equilibrium are worth noting. First, in equilibrium no community
will be empty. If one were, then in any community that contained more than one
income type, those with higher income would be made better off by moving to the
empty community, imposing their preferred tax rate, and engaging in no redistribution.
Second, in a locally stable equilibrium communities cannot offer the same bundles and
contain more than one type of individual (as a small measure of those with higher income
could move to one of the communities, increase mean income there and end up with the
same or a higher income median voter who has preferences closer to theirs). Lastly,
if communities have different qualities of education (as they must if the communities
are heterogeneous), then a community with a strictly higher q than another must also
have a strictly higher t (as otherwise no individual would choose to reside in the
lower-quality, higher-tax community).

In the economic environment described above all locally stable equilibria must be 
stratified, i.e., individuals will sort into communities by income. In such equilibria, 

^15 Most assumptions here are for simplicity only, e.g., preferences need not be separable and introducing
housing and property taxation rather than income taxation would allow a sorting equilibrium to be
characterized by higher-income communities having lower tax rates (but higher tax-inclusive prices) and
higher q. We forego the last option as it simply complicates matters without contributing additional
insights.

^16 Note that type here is synonymous with income level. Hence the assumption that there are fewer
neighborhoods than types is a reasonable one to make.

^17 This assumption implies that an increase in the mean income of the community that does not change
the identity of the median voter will result in a higher q and a lower t, ensuring that all residents are
made better off.

communities can be ranked by the quality of education they offer, their income tax
rate, and the income of the individuals that belong to them. Thus, all stable equilibria
can be characterized by a ranking of communities such that $\forall j$, $q_j > q_{j+1}$, $t_j > t_{j+1}$,
and $\min_{y_i \in C_j} y_i \geq \max_{y_i \in C_{j+1}} y_i$.

To facilitate the illustration of inefficiency, assume for simplicity that there are only
two communities $j = 1, 2$ and $I > 2$ types of individuals.^18 A stratified equilibrium
will have all individuals with income strictly greater than some level $y_b$ living in $C_1$ and
those with income strictly lower than $y_b$ living in $C_2$, with $q_1 > q_2$ and $t_1 > t_2$.

Suppose that in equilibrium individuals with income $y_b$ live in both communities. It
is easy to graph the utility

$$V_b^j = u(y_b(1 - t_j)) + v(t_j \mu_j) \tag{2.5}$$

of these “boundary” individuals as a function of the community in which they reside
and as a function of the fraction of these individuals that reside in $C_1$. Let $\rho_b^*$ denote
the equilibrium value of the fraction of boundary individuals residing in $C_1$. Note that a decrease
in $\rho_b$ from its equilibrium value that does not alter the identity of the median voter in
either community will make individuals with income $y_b$ better off in both communities
as mean incomes will rise, qualities of education increase, and tax rates fall in both
communities. Thus in order for this equilibrium to be locally stable, it must be that
such a decrease makes $y_b$ individuals even better off in $C_1$ relative to $C_2$, reversing the
outward flow and reestablishing $\rho_b^*$ as the equilibrium. Thus, as shown in Figure 1, the
$V_b^1$ curve must cross the $V_b^2$ curve from above.^19

This equilibrium is clearly inefficient. Consider a marginal subsidy of $s > 0$ to all
individuals with income $y_b$ who choose to reside in $C_2$.^20 Given that without a subsidy
these individuals are indifferent between residing in either of the two communities, it
follows that a subsidy will increase the attractiveness of $C_2$ relative to $C_1$. Consequently,
some $y_b$ individuals will move to $C_2$, thereby increasing mean income in both
communities. For a small enough subsidy such that the identity of the median voter does not

^18 See Fernández and Rogerson (1996) for a generalization of this argument to many communities.

^19 Note that we are assuming for the range of $\rho_b$ shown that neither of the communities’ median voters
is changing.

^20 If income is unobservable, then a small subsidy to all individuals who reside in $C_2$ would have to
be paid.

change in either community, the overall effect will be to decrease tax rates and increase
the quality of education in both communities, thus making all individuals better off.
Thus, it only remains to show that the subsidy can be financed in such a way as to retain
the Pareto-improving nature of this policy. A simple way to do so is by (marginally)
taxing those $y_b$ individuals who remain in $C_1$.^21 This tax will only further increase their
outflow from $C_1$ to the point where they are once again indifferent between residing in
either of the two communities. As shown in Figure 1, the tax serves to further increase the utility
of this income group (and consequently everyone else’s). This last point suggests that
a simpler way of producing the same Pareto-improving results is a policy that foregoes
the subsidy and simply taxes any $y_b$ individual in $C_1$. This would again induce the
desired migration and increase mean income in both communities.
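
The comparative static behind this argument can be traced in a stylized numerical sketch. It assumes three income levels, unit masses of the richest and poorest types, a unit mass of boundary types split between the communities, and the hypothetical utilities u(c) = -1/c and v(q) = -1/q (which satisfy both curvature conditions above). The starting split of 0.5 is chosen for illustration and is not claimed to be an equilibrium; all names and numbers are assumptions.

```python
def tax_and_quality(y_median, mu):
    """Median voter's preferred tax t and quality q = t * mu when
    u(c) = -1/c and v(q) = -1/q, so that t / (1 - t) = sqrt(y_median / mu)."""
    r = (y_median / mu) ** 0.5
    t = r / (1.0 + r)
    return t, t * mu

def community_means(rho, y_rich=10.0, y_b=5.0, y_poor=2.0):
    """Mean income in C1 and C2 when a mass rho of boundary types lives in C1.
    C1 also holds a unit mass of y_rich; C2 holds a unit mass of y_poor
    plus the remaining mass 1 - rho of boundary types."""
    mu1 = (y_rich + rho * y_b) / (1.0 + rho)
    mu2 = (y_poor + (1.0 - rho) * y_b) / (2.0 - rho)
    return mu1, mu2

def outcomes(rho):
    mu1, mu2 = community_means(rho)
    # For 0 < rho < 1 the median voter is y_rich in C1 and y_poor in C2,
    # so the relocation does not change either median voter's identity.
    return tax_and_quality(10.0, mu1), tax_and_quality(2.0, mu2)

before = outcomes(0.5)  # illustrative starting split of boundary types
after = outcomes(0.4)   # some boundary types have moved from C1 to C2
```

Comparing `before` and `after` shows lower tax rates and higher quality in both communities, which is the channel through which every resident gains in the text's argument.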

Fernández and Rogerson (1996) examine these and other interventions in a model
with many communities. The principle guiding the nature of Pareto-improving policies
is not affected by the number of communities considered; policies that serve to increase
mean income in some or in all communities by creating incentives to move relatively
wealthier individuals into poorer communities will generate Pareto improvements.^22

The possibility of Pareto improvements over the decentralized equilibrium in the 
model above arises as a result of individuals not taking into account the effect of their 
residence decisions on community mean income. In the next example, the inefficiency 
of equilibrium results from individual residence decisions not internalizing diminishing 
returns. 

Consider a multi-community model with two communities, $C_1$ and $C_2$, and a total
population (of parents) of $N = 2$. Parents differ in their human capital, $h_i$, and
potentially in their own income $y_i$. To simplify matters we assume that the initial
distribution is confined to two values $h_1$ and $h_2$ with $h_1 > h_2$ and total numbers of
parents of each type given by $n_1$ and $n_2$, respectively, such that $n_1 + n_2 = 2$.

We assume that each community has a fixed number of residences, $N/2 = 1$, each
available at a price $p_j$, $j = 1, 2$. Let $\lambda_1$ be the fraction of high human-capital parents
who choose to live in $C_1$ (and thus $\lambda_2 = n_1 - \lambda_1$) and let $\mu_j$ be the mean human capital
of parents that reside in $C_j$. Thus, $\mu_j(\lambda_j) = \lambda_j h_1 + (1 - \lambda_j) h_2$.

Again, if income is not observable, it is possible to preserve the Pareto improving nature of this 
policy by (marginally) teixing all C\ residents. 

^^The exact specification of these policies, however, depends on the number of communities involved 
in a rather odd fashion as explained in Femdndez (1997). 
Parents decide in which community to live, pay the price $p_j$ of residing there, and
send their children to the community school. Parents care about aggregate family
consumption, which is given by the sum of their own income and the child’s future
income, $I$, minus the cost of residing in the community, plus a lump-sum transfer $T$.

The child’s future income is an increasing function of the human capital she acquires.
This depends on her parent’s human capital and on local human capital $q$, which is
assumed to be an increasing function of the mean human capital in the neighborhood.
As the latter is simply a linear function of $\lambda_j$, we denote this function as $q_j = Q(\lambda_j)$,
$Q' > 0$. Thus,

$$I_{ij} = F(h_i, Q(\lambda_j)) \tag{2.6}$$

with $F_h, F_q > 0$ and where $I_{ij}$ indicates the income of a child with a parent of human
capital $h_i$ that resides in neighborhood $j$.

Hence, parents choose a community in which to reside that maximizes 



$$u(y_i + I_{ij} - p_j + T) \tag{2.7}$$

subject to (2.6) and taking $p_j$, $T$ and $q_j$ as given. Note that if parental and local human
capital are complements in the production of a child’s future income, then (2.7) obeys
(2.2), and hence individuals will sort.^ Henceforth, I will assume this is the case, that is,

$$\frac{\partial}{\partial h}\left(\left.\frac{dp}{d\lambda}\right|_{u=\bar{u}}\right) = F_{hq}Q' > 0. \tag{2.8}$$

Given (2.8), the only locally stable equilibrium is that with maximal sorting.
Individuals with human capital $h_1$ live in $C_1$, characterized by a higher $p$ and a higher $q$
than that in $C_2$; individuals with $h_2$ live in $C_2$. If the number of one of these types
exceeds the space available in a community (i.e., 1), then that type is indifferent and
lives in both communities. Thus, in equilibrium $\lambda_1 = \min(1, n_1)$.

In order to close the model, we need to specify housing prices. Rather than deter- 
mining the price by specifying the microfoundations of the housing market, as in many 

^^See de Bartolome (1990) for a two-community fixed housing stock model in which there is com- 
plementarity between spending on education and ability but in which peer effects matter more for low 
ability students. 
models in the literature, we simply solve for the price differential such that no individual
would wish to move.^ Depending on whether $n_1$ is greater than, smaller than, or equal to 1,
there are three different possible configurations: in the first case $h_1$ types must be
made indifferent ($p_1 - p_2 = I_{11} - I_{12}$); in the second $h_2$ types must be made indifferent
($p_1 - p_2 = I_{21} - I_{22}$); whereas in the third each type must be at least as well off in its
own community as in the other ($I_{11} - I_{12} \geq p_1 - p_2 \geq I_{21} - I_{22}$). Rather than
include landlords or define the structure of house ownership by agents, we simply assume,
as in de Bartolome (1990), that housing rents are rebated to individuals in a lump-sum
fashion so that each individual receives $T = (p_1 + p_2)/2$ regardless of the community of
residence.
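
The three configurations can be illustrated with a minimal sketch. The function name `sorting_equilibrium` and the functional forms for `F` and `Q` below are hypothetical choices made only for illustration (they satisfy $F_{hq} > 0$ and $Q' > 0$); they are not taken from the model's source.

```python
from typing import Callable, Tuple


def sorting_equilibrium(
    n1: float,
    h1: float,
    h2: float,
    F: Callable[[float, float], float],
    Q: Callable[[float], float],
) -> Tuple[float, float, Tuple[float, float]]:
    """Maximal-sorting allocation and the price gaps p1 - p2 that support it.

    n1 is the mass of h1 parents (0 <= n1 <= 2); each community holds mass 1.
    Returns (lambda_1, lambda_2, (lowest gap, highest gap)).
    """
    lam1 = min(1.0, n1)   # high types fill C1 first
    lam2 = n1 - lam1      # overflow of high types, if any, lives in C2
    q1, q2 = Q(lam1), Q(lam2)
    gain_high = F(h1, q1) - F(h1, q2)  # I_11 - I_12
    gain_low = F(h2, q1) - F(h2, q2)   # I_21 - I_22
    if n1 > 1.0:      # h1 types live in both communities: they must be indifferent
        gap = (gain_high, gain_high)
    elif n1 < 1.0:    # h2 types live in both communities: they must be indifferent
        gap = (gain_low, gain_low)
    else:             # exact fit: any gap between the two gains deters moves
        gap = (gain_low, gain_high)
    return lam1, lam2, gap


# Assumed (illustrative) primitives: complementarity in F, concave Q.
H1, H2 = 4.0, 1.0
F = lambda h, q: h * q
Q = lambda lam: (lam * H1 + (1.0 - lam) * H2) ** 0.5

lam1, lam2, (low, high) = sorting_equilibrium(1.0, H1, H2, F, Q)
print(lam1, lam2, low, high)
```

With $n_1 = 1$ the equilibrium pins down only an interval for $p_1 - p_2$; with $n_1 \neq 1$ the interval collapses to the indifference condition of the type that lives in both communities.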

Is the decentralized equilibrium efficient? Rather than characterizing Pareto im- 
proving policies, I will confine my discussion here to investigating whether the unique 
locally stable decentralized equilibrium (that with maximum sorting) maximizes pro- 
ductive efficiency. 

The tensions that exist in this model are easy to define. On the one hand, parental 
and local human capital are complements, suggesting that future output is maximized by 
sorting, i.e., efficiency requires concentrating high-human capital parents in the same 
community, precisely what occurs in equilibrium. On the other hand, there is an 
externality to individual residence decisions that is not being taken into account, namely 
potentially decreasing returns to the concentration of high human-capital individuals in 
the same neighborhood. In particular, individuals do not take into account whether an
additional unit of high human capital on the margin increases local human capital more
in the community with a high or low concentration of $h_1$. Similarly, they do not take
into account whether a marginal increase in local human capital will add more to total
output by being allocated to a community with a high or low concentration of $h_1$.

To see this more formally, consider the total future income Y generated by a com- 
munity given that a fraction A of high human-capital parents live there: 

Wheaton (1977) and de Bartolome (1990). 

^^See Benabou (1993) for a multi-community model in which individuals can acquire high or low skills 
or be unemployed. The costs of acquiring skills are decreasing in the proportion of the community that 
is highly skilled but this decrease is larger for those acquiring high skills. This leads to sorting although 
ex ante all individuals are identical. As in the model discussed here, there will be maximal sorting by (ex 
post) high-skill individuals. The interesting question in this paper is how the decentralized equilibrium
compares to one with no sorting given that neither is efficient (since in both cases individuals ignore 
the externality of their skill acquisition decision on the costs faced by others). 
$$Y(\lambda) = \lambda F(h_1, Q(\lambda)) + (1 - \lambda)F(h_2, Q(\lambda)) \tag{2.9}$$



Note that if future income is concave in $\lambda$, then it is maximized by allocating high
human-capital parents so that they constitute the same proportion in both communities,
i.e., $\lambda_1 = \lambda_2$. If, on the other hand, future income is convex in $\lambda$, then maximum sorting
will maximize future income, i.e., as in the decentralized equilibrium, $\lambda_1 = \min(1, n_1)$.

Taking the appropriate derivatives yields: 

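$$Y'' = 2\left[F_q(h_1, Q(\lambda)) - F_q(h_2, Q(\lambda))\right]Q' + \left[\lambda F_q(h_1, Q(\lambda)) + (1 - \lambda)F_q(h_2, Q(\lambda))\right]Q''
+ \left[\lambda F_{qq}(h_1, Q(\lambda)) + (1 - \lambda)F_{qq}(h_2, Q(\lambda))\right]Q'^2 \tag{2.10}$$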
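
As an intermediate step (a sketch of the omitted algebra, not part of the original text), differentiating (2.9) once with respect to $\lambda$ gives:

```latex
Y'(\lambda) = F(h_1, Q(\lambda)) - F(h_2, Q(\lambda))
  + \left[\lambda F_q(h_1, Q(\lambda)) + (1-\lambda) F_q(h_2, Q(\lambda))\right] Q'(\lambda)
```

Differentiating again, the term $[F_q(h_1, Q) - F_q(h_2, Q)]Q'$ appears twice: once from the difference $F(h_1, Q) - F(h_2, Q)$ and once from the weights $\lambda$ and $1 - \lambda$. This is the source of the factor 2 on the first bracket of the second derivative; the remaining two brackets come from differentiating $Q'$ and $F_q$ respectively.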

Let us carefully examine the terms in (2.10). The complementarity of parental 
and local human capital in the production of children’s human capital guarantees that 
the expression in the first square brackets is positive. Thus, this factor pushes in the
direction of convexity of $Y$ and thus in favor of sorting. Recall from (2.8) that it is
only on the basis of this factor that sorting occurs in equilibrium. If there are decreasing
returns to community mean human capital in the formation of local human capital,
however, i.e., if $Q$ is concave (and thus $Q'' < 0$), then $Q''$ times the expression in
the second square brackets will be negative, imposing losses from concentrating parents
with high human capital in the community. Lastly, there will be an additional loss from
sorting if there are decreasing returns to local human capital in the production of future
income, i.e., if $F_{qq} < 0$, as this implies that the term in the third square brackets is
negative. Thus, decreasing returns to community mean human capital in the formation 
of local human capital and decreasing returns to local human capital in the production 
of children’s future income suggest that Y is concave, and hence that efficiency would be 
maximized by having parents with high-human capital distributed in both communities 
in the same proportion.^® 
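
The tension between complementarity and diminishing returns can be checked numerically. The sketch below assumes hypothetical forms $F(h, q) = h^a q^b$ and $Q(\lambda) = \mu(\lambda)^{\rho}$ (neither is specified in the text) and compares total future income under maximal sorting and under equal mixing when $n_1 = 1$.

```python
def community_output(lam: float, h1: float, h2: float,
                     a: float, b: float, rho: float) -> float:
    """Y(lambda) from (2.9) under the assumed forms F = h**a * q**b, Q = mean_h**rho."""
    q = (lam * h1 + (1.0 - lam) * h2) ** rho
    return (lam * h1 ** a + (1.0 - lam) * h2 ** a) * q ** b


def economy_output(lam1: float, n1: float, h1: float, h2: float,
                   a: float, b: float, rho: float) -> float:
    """Sum of future income over both communities when C1 holds lam1 high types."""
    lam2 = n1 - lam1
    return (community_output(lam1, h1, h2, a, b, rho)
            + community_output(lam2, h1, h2, a, b, rho))


H1, H2, N1 = 4.0, 1.0, 1.0

# Strong complementarity, no diminishing returns: Y is convex in lambda.
sorted_convex = economy_output(1.0, N1, H1, H2, a=1.0, b=1.0, rho=1.0)
mixed_convex = economy_output(0.5, N1, H1, H2, a=1.0, b=1.0, rho=1.0)

# Weak complementarity, diminishing returns to q: Y is concave on [0, 1].
sorted_concave = economy_output(1.0, N1, H1, H2, a=0.1, b=0.5, rho=1.0)
mixed_concave = economy_output(0.5, N1, H1, H2, a=0.1, b=0.5, rho=1.0)

print(sorted_convex, mixed_convex)
print(sorted_concave, mixed_concave)
```

In both parameterizations $F_{hq} > 0$, so the decentralized equilibrium sorts; only in the first does sorting also maximize total output.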

It is important to recall that maximum sorting will take place as long as $F_{hq}$ is
positive but otherwise independently of its magnitude. Hence a very small amount of
complementarity (again, the expression in the first square brackets) and private gain
could easily be swamped by the concavity of $F$ and $Q$ and social loss.

If $Y$ is neither globally concave nor convex, then the equilibrium will still produce too much sorting
and some redistributing of high human capital individuals towards a more equal distribution will be
efficiency enhancing.

The model presented above is one in which all sorting is taking place because of
peer effects; that is, people want to live with individuals with high human capital as it
increases the earnings of their children. As local human capital and parental human
capital are complements, high human-capital parents outbid others to live in a community
where the level of local human capital is highest, leading to stratification by
parental human capital levels. Note that income and the perfection or imperfection of
capital markets actually played no role in producing the results above.^^

The above analysis also suggests that if spending on education $E$ were an additional
factor in the production of future income but not a factor that individuals sorted on,
i.e., $F(h, Q(\lambda), E(\lambda))$ with $F_E > 0$, $E' > 0$ and $F_{hE} = 0$, then sorting would occur for
the same reasons as before, but even a policy of enforced equalization of spending across
communities would not stop individuals from sorting.

Unfortunately, there has been very little work done to assess the significance of
the inefficiencies discussed above. Although much work points, for example, to the
importance of peer effects in learning, whether the appropriate cross-partial is negative
or positive remains in dispute (i.e., we do not even know whether it would be efficient, all
considerations of diminishing returns aside, for children to sort by aptitude, for example,
or for them to mix).^® Similarly, we do not know whether quality of education (say,
spending) and parental human capital are complements. This, in my view, makes
models in which the main imperfection lies in the functioning of the capital market
(and sorting on grounds of minimizing redistribution) relatively more attractive.^®

^^The fact that utility depends only on total net family income and that the latter is not influenced
by spending allows us to abstract from issues of borrowing and lending as long as parents have sufficient
income to bid successfully for housing.

^®For example, Henderson, Mieszkowski, and Sauvageau (1978) argue for a zero cross partial and
diminishing returns whereas Summers and Wolfe (1977) argue for a negative cross partial.

^^These borrowing constraints may not allow families to borrow to send their child to private school, for
example. Alternatively, they may not allow poorer families to borrow to live in (wealthy) neighborhoods
with higher quality public education. The general failure of these credit markets is that parents are
unable to borrow against the future human capital of their children.
2.3. Comparing Systems of Financing Public Education: Dynamic Considerations

The choice of education finance system matters for various reasons. First, and foremost, 
different finance systems tend to imply different levels of redistribution. In economies 
in which there is imperfect access to financing the acquisition of human capital, redis- 
tribution can play an important role in increasing the human capital levels of children 
from lower-income families. Different finance systems may also have important
consequences for who lives where and thus for the identity of a child’s peers and for the
use of the land market.

There have been several papers written in this area that examine primarily the static 
consequences of different systems of financing education.^® Fernandez and Rogerson 
(1999b), for example, examine five different education finance systems, and contrast the 
equity and resources devoted to education across these systems assuming that the
parameters of the education finance system are chosen by majority vote. They calibrate
their benchmark model to US statistics and find that total spending on education may
differ by as much as 25% across systems. Furthermore, the trade-off between
redistribution and resources to education is not monotone; total spending on education is
high in two of the systems that also substantially work to reduce inequality. A political
economy approach to the contrast of different education finance systems has also
been pursued by Silva and Sonstelie (1995) and Fernandez and Rogerson (1999a), who
attempt to explain the consequences of California’s education finance reform, whereas
Nechyba (1996) and de Bartolome (1997) both study foundation systems. There is also a
growing empirical literature devoted to examining how changes in state-level education
finance systems affect education spending, including Downes and Schoeman (1998),
Loeb (1998), Hoxby (1998), Evans, Murphy, and Schwab (1997, 1998), and Manwaring
and Sheffrin (1997).

The papers mentioned above, however, are only indirectly concerned with the conse- 
quences of sorting and they are all static models. In this section, by way of contrast, we 
will focus on dynamic consequences of sorting in response to different education finance 
systems. To facilitate the theoretical analysis, we will focus on two extreme systems: a 

^°See Inman (1978) for an early quantitative comparison of education finance systems in the context
of an explicit model.
pure local system with perfect sorting and a state system with uniform school spending
per student across communities.

This section presents two models.^^ The first, based on Fernandez and Rogerson
(1997a, 1998), uses a Tiebout model in which perfect sorting, from a static perspective, is
efficient. It then examines the trade-off imposed by switching to a state-financed system.
The model is calibrated to US statistics, allowing one to determine whether these trade-offs
are quantitatively significant. The main trade-off this analysis illustrates is that
between a system that, loosely speaking, allows individuals to consume bundles that are
“right” for them given their income versus a system that imposes a uniform education
bundle across heterogeneous individuals, but allows for more efficient use of resources
from the perspective of future generations. In particular, in an economy in which
borrowing constraints prevent individuals from financing their education and missing
insurance markets do not allow children (or parents) to insure against income or ability
shocks, a state system may result in a more efficient production of next period’s income
(again in a sense that will be made rigorous below) than a local system in which the
possibilities for redistribution are only at the local level. The trade-offs are found to
be quantitatively significant.

The second model is based on Benabou (1996). This is a purely theoretical analysis 
that contrasts the short versus long-run consequences of a local compared to a state 
system in which the main trade-off is between human capital being complementary in 
production at the economy-wide level but parental human capital and spending on
education being complementary at the local level. 

The simplest contrast between the dynamic consequences of these two extreme forms
of education finance, local versus state, can be examined in the familiar Tiebout model
of perfect sorting in which income is the only source of heterogeneity among individuals.
This allows us to abstract away from complications that would be introduced by the
political economy of tax choice at the local level when individuals are heterogeneous, by
changes in residence over time with the dynamic evolution of the income distribution,
^^See Fernandez and Rogerson (forthcoming) for a dynamic analysis of a foundation system. 

Other dynamic analyses of education finance systems include Cooper (1998), Durlauf (1995), Glomm
and Ravikumar (1992), and Saint-Paul and Verdier (1993).

^^It may be objected that this analysis confounds two things: the amount of redistribution (or
insurance) and the system of education. In reality, education always entails some redistribution and
a multidimensional political economy model would be required to allow one to differentiate between 
redistribution directly through income and through education. 
by housing (and the inefficiencies that stem from taxing this good), peer effects, or
simply from diversity in tastes.^^ Note that by considering a Tiebout system with
perfect sorting, we can reinterpret what follows as contrasting a purely private system
of education with a state-financed one.

Following Fernandez and Rogerson (1997a), consider a two-period overlapping generations
model in which each person belongs to a household consisting of one old individual
(the parent) and a young one (the child). Parents make all the decisions and
have identical preferences described by

$$U(c, y') = u(c) + E\,w(y') \tag{2.11}$$

where $y'$ is next period’s income of the household’s child and $E$ is the expectations
operator.

In the first period of life, the child attends school and obtains the quality of education
$q$ determined by her parent’s (equivalently community’s) spending. In the second period,
the now old child receives a draw from the income distribution. A child’s income when
old is assumed to depend on the quality of schooling and on an iid shock $\xi$ whose
distribution is assumed to be independent of $q$. Thus,

$$y' = f(q, \xi) \tag{2.12}$$

Once the adult’s income is determined so is the community of residence as adult.
The adult (now a parent) then decides how much of her income to consume and how
much to spend on her own child’s education. Letting $v(q) \equiv E\,w(f(q, \xi))$, we can
now write preferences exactly as in equation (2.4). Assuming that $v$ is well behaved,
under a local system individuals will set spending on education to equate the marginal
utility of consumption with the marginal utility of education quality (i.e., $u'(c) = v'(q)$),
implying a local tax rate $\tau(y)$ and $q = \tau(y)y$.

We next turn to the determination of spending on education in a state-financed
system. We assume that all individuals face the same proportional income tax rate $\tau_s$
that is used to finance public education $q = \tau_s\mu$, where $\mu$ is mean income, and that
individuals are unable to opt out of public education to a private system.^

^■'See, however, Fernandez and Rogerson (1998) for a more complex dynamic model in which the
sorting of individuals into communities endogenously evolves over time along with housing prices and
the housing stock.

The first-order condition for utility maximization now equates the ratio of the marginal
utility of consumption and the marginal utility of education quality to the ratio of
the mean relative to individual income, i.e., $u'(c)/v'(q) = \mu/y$. Note that this condition reflects
the fact that under a state financing system, unlike in the local system, the relative
price of a unit of education (in terms of forgone consumption) is not the same across
individuals. Lower-income individuals face a lower price than higher-income individuals.
In a local finance system, on the other hand, this relative price equals one for all
individuals. Under majority vote, concavity of $u$ and $v$ implies that the preferences of
the individual with median income in the population determine the choice of $\tau_s$.

Letting $g_t(y)$ be the income distribution of old individuals at the beginning of period
$t$, under either education finance system an equilibrium at the end of period $t$ generates
a beginning-of-period income distribution for period $t + 1$, $g_{t+1}$. Let $F(g(y))$ be the
income distribution that results in the following period given this period’s distribution
$g(y)$. A steady state in this model then consists of an income distribution $g^*$ such
that $g^*(y) = F(g^*(y))$.

Calibrating this simple model involves making choices over the education quality
technology and preferences. There is a large and controversial literature that surrounds
the education production function and there is no consensus on the form it should
take.^ Guided primarily by simplicity, a convenient specification is $y' = Aq^{\theta}\xi$, which
yields an elasticity of future income with respect to education quality that is constant
and equal to $\theta$. Evidence presented by Card and Krueger (1992), Wachtel (1976), and
Johnson and Stafford (1973) suggests an elasticity of earnings with respect to education
expenditures close to 0.2. We assume that $\xi$ is lognormally distributed such that $\log \xi$
has zero mean and standard deviation $\sigma_\xi$.

Our specification of preferences comes from noting that across US states the share
of personal income devoted to public elementary and secondary education has remained
roughly constant over the 1970-1990 period. This property will be satisfied if the
indirect utility function is homogeneous in income, i.e., if both the consumption term and
the expected-utility term (which involves some function of the shock $\xi$) are proportional
to $y^{\alpha}$.

^^Introducing a private option into this system greatly complicates the analysis as existence of equi- 
librium is not ensured. See Stiglitz (1974). 

^®See Coleman et al (1966), Hanushek (1986), Card and Krueger (1992), and Heckman, Layne-Farrar, 
and Todd (1996). 

^^See Fernandez and Rogerson (2001a) for evidence. 
This requires a utility function of the form

$$\frac{c^{\alpha}}{\alpha} + b\,\frac{E\left(y'^{\gamma}\right)}{\alpha} \tag{2.13}$$

with the restriction that $\theta\gamma = \alpha$.

Under local financing the preferences above imply a constant and identical tax rate
across individuals, $\tau^* = 1/(1 + \kappa)$ where $\kappa = (bA^{\gamma}E(\xi^{\gamma}))^{1/(\alpha-1)}$. If a parent’s income in period
0 is $y_0$, it follows that the child’s income, $y_1$, is given by
$\log y_1 = \log A + \theta\log\tau^* + \theta\log y_0 + \log\xi_1$.
Given $\theta < 1$, it follows that $\log y_t$ has a limiting distribution that is
normal with mean and standard deviation:

$$\mu_\infty = \frac{\log A + \theta\log\tau^*}{1 - \theta}, \qquad \sigma_\infty = \frac{\sigma_\xi}{(1 - \theta^2)^{1/2}} \tag{2.14}$$

We calibrate the steady state of the local model to match US statistics. We choose
$A$ and $\sigma_\xi$ such that $\mu_\infty$ and $\sigma_\infty$ match the mean and median of the US family income
distribution, respectively \$23,100 and \$19,900 in the 1980 census. The remaining
parameters to be set are $b$ and $\alpha$, as the value of $\theta$ is already determined by the elasticity
of earnings with respect to $q$.
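
A minimal sketch of this calibration step follows. It assumes the standard lognormal facts (median $= e^{\mu_\infty}$, mean $= e^{\mu_\infty + \sigma_\infty^2/2}$) and inverts (2.14) for $\sigma_\xi$ and $\log A$. The function name is hypothetical, and $\theta = 0.2$ and $\tau^* = 0.041$ are the values the surrounding text reports.

```python
import math
from typing import Tuple


def calibrate_local_steady_state(
    mean_income: float, median_income: float, theta: float, tau_star: float
) -> Tuple[float, float, float, float]:
    """Return (mu_inf, sigma_inf, sigma_xi, log_A) for the local steady state."""
    # Lognormal income: median = exp(mu), mean = exp(mu + sigma^2 / 2).
    mu_inf = math.log(median_income)
    sigma_inf = math.sqrt(2.0 * math.log(mean_income / median_income))
    # Invert (2.14): sigma_inf = sigma_xi / sqrt(1 - theta^2).
    sigma_xi = sigma_inf * math.sqrt(1.0 - theta ** 2)
    # Invert (2.14): mu_inf = (log A + theta * log tau*) / (1 - theta).
    log_a = mu_inf * (1.0 - theta) - theta * math.log(tau_star)
    return mu_inf, sigma_inf, sigma_xi, log_a


mu_inf, sigma_inf, sigma_xi, log_a = calibrate_local_steady_state(
    mean_income=23100.0, median_income=19900.0, theta=0.2, tau_star=0.041
)
print(mu_inf, sigma_inf, sigma_xi, log_a)
```

The mean and median identify $\mu_\infty$ and $\sigma_\infty$ directly; $A$ and $\sigma_\xi$ then follow once $\theta$ and $\tau^*$ are fixed.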

For any given $\alpha$, we set $b$ to match the fraction of personal income devoted to public
elementary and secondary education (in 1980 equal to 4.1 percent), that is, to yield
a tax rate $\tau^* = 0.041$. This determines, for a given value of $\alpha$, a value of $b$ given by
$b = \left(\frac{1-\tau^*}{\tau^*}\right)^{\alpha-1}\big/\left(A^{\gamma}E(\xi^{\gamma})\right)$. To set $\alpha$, we draw upon two pieces of information. The first is the
price elasticity of expenditures on education. In our model this can be computed at
the equilibrium price (in terms of the consumption good), which here has been set to
one. A survey of the literature by Bergstrom, Rubinfeld and Shapiro (1982) suggests
an elasticity between $-0.5$ and $-0.25$, yielding $\alpha$ between $-1$ and $-2$. The second is
from Fernandez and Rogerson (1999), who model a foundation education finance system
and use it to match the distribution of spending per student in California prior to the
Serrano reform. They find an implied value for $\alpha$ equal to $-0.2$.

One of the main questions we are interested in asking is whether a local system will
outperform a state system. Obviously, there is no reason to expect that individuals of
all income levels will prefer one system over another nor that different generations will
agree on the relative merits of the two systems. In order to have a measure of aggregate
welfare, we use the sum of individual utilities, or equivalently, the expected utility that
an agent would obtain if she were to receive a random draw from the equilibrium income
distribution. Thus, we use

$$V_{rt} \equiv \int U_{rt}\,g_{rt}(y)\,dy \tag{2.15}$$

as our measure of aggregate welfare at time $t$, where $r = L, S$ (i.e., local $L$, state $S$)
indicates the education finance regime.^®

To provide a measure of welfare change at time $t$ that is unaffected by monotone
transformations of the utility function, we examine the proportion by which the income
distribution in the steady state of the local regime would have to be changed so that
it provided the same aggregate welfare as the state system in period $t$. Given that the
functional forms adopted are homogeneous of degree $\alpha$ in income, this amounts to finding
the value of $\Delta_t$ such that $(1 + \Delta_t)^{\alpha}V_L = V_{St}$, where the local system is evaluated at its
steady state and the state system in period $t$.
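
Solving the homogeneity condition for $\Delta_t$ gives $\Delta_t = (V_{St}/V_L)^{1/\alpha} - 1$. The helper below is a hypothetical illustration of that inversion; it does not cover $\alpha = 0$, where the power form degenerates and a logarithmic limit would be needed.

```python
def equivalent_income_change(v_local: float, v_state: float, alpha: float) -> float:
    """Solve (1 + delta)**alpha * v_local == v_state for delta.

    Assumes alpha != 0 and that both welfare levels have the same sign
    (both are negative when alpha < 0 under the power-utility form).
    """
    if alpha == 0.0:
        raise ValueError("alpha = 0 requires the logarithmic case")
    ratio = v_state / v_local
    if ratio <= 0.0:
        raise ValueError("welfare levels must share the same sign")
    return ratio ** (1.0 / alpha) - 1.0


# Illustrative numbers only: a state-system welfare level equal to what the
# local steady state would deliver if all incomes were 10 percent higher.
alpha = -1.0
v_local = -2.0
v_state = v_local * 1.10 ** alpha
delta = equivalent_income_change(v_local, v_state, alpha)
print(delta)
```

Because the measure is expressed as a proportional change in income, it is invariant to the scaling of utility, which is the property the text asks for.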

If $\alpha$ is negative (as our calibration procedure suggests), then preferred tax rates under
a state system are increasing in income (under a local system, as noted previously, they
are independent of income) and only equal to the local tax rate for those individuals with
income such that $y_i = \mu$. Since the median voter’s income is lower than mean income,
it follows that the tax rate will be lower under the state system than the (identical) tax
rate chosen by each income group under the local system. This implies that in the first
period, given that the income distribution is the same as in the local system, aggregate
spending on education will decrease.
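
This comparison can be made concrete under an assumed parameterization. With $u(c) = c^{\alpha}/\alpha$ and $v(q)$ proportional to $q^{\alpha}/\alpha$ (the power forms implied by (2.13) with $\theta\gamma = \alpha$), the state first-order condition $u'(c)/v'(q) = \mu/y$ with $c = (1 - \tau)y$ and $q = \tau\mu$ can be solved in closed form: $\tau(y) = 1/\left(1 + \kappa(\mu/y)^{\alpha/(\alpha-1)}\right)$, with $\kappa = (1 - \tau^*)/\tau^*$. The function name below is hypothetical and the closed form is a derivation sketch rather than a formula quoted from the source.

```python
def preferred_state_tax(y: float, mean_income: float,
                        alpha: float, tau_local: float) -> float:
    """Preferred state tax rate of a voter with income y (assumes alpha < 1)."""
    if alpha >= 1.0:
        raise ValueError("the closed form assumes alpha < 1")
    kappa = (1.0 - tau_local) / tau_local   # pinned down by the local rate tau*
    exponent = alpha / (alpha - 1.0)        # positive when alpha < 0
    return 1.0 / (1.0 + kappa * (mean_income / y) ** exponent)


MEAN_Y, MEDIAN_Y, TAU_LOCAL = 23100.0, 19900.0, 0.041

tau_at_mean = preferred_state_tax(MEAN_Y, MEAN_Y, -1.0, TAU_LOCAL)
tau_median_voter = preferred_state_tax(MEDIAN_Y, MEAN_Y, -1.0, TAU_LOCAL)
print(tau_at_mean, tau_median_voter)
```

For $\alpha < 0$ the preferred rate rises with income and coincides with the local rate only at $y = \mu$, so a median voter with below-mean income chooses a rate below $\tau^*$; with the 1980 mean and median and $\alpha = -1$ it is roughly 3.8 percent.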

The table below shows the tax rate, mean and median income in the steady state
of the state finance regime. The last two columns report the first-period gain in aggregate
welfare (i.e., prior to the change in mean income), which we denote by $\Delta_1$, and the
steady-state gain in aggregate welfare, denoted $\Delta_\infty$. Despite the fact that for $\alpha$ strictly
negative spending on education will decrease in the first period of reform (relative to its
value in the local system), we find that steady-state mean income is always higher than
in the local steady state. Furthermore, aggregate welfare increases in period 1 as well
as in every subsequent period relative to the initial local finance system steady state.

^®Note that this is equivalent to a utilitarian welfare measure or one chosen “behind the veil of 
ignorance” (i.e., an individual’s welfare if her parents were a random draw from the income distribution 
in that system). 

^^The new steady state is typically reached in five periods. 
Table 1

Steady State Comparisons of Local vs State

$\alpha$    $\tau_s \times 10^2$    mean $y$    median $y$    $1 + \Delta_1$    $1 + \Delta_\infty$
 0        4.10        25,300      21,900        1.006        1.108
-0.2      4.01        25,100      21,800        1.007        1.104
-0.5      3.91        25,000      21,600        1.009        1.101
-1        3.82        24,000      21,500        1.011        1.101
-2        3.74        24,000      21,400        1.015        1.105

As shown in the table above, the first-period gains are relatively small (around
1 percent). The steady-state gain is surprisingly constant across parameter values,
even though the tax rate is changing relative to the local steady state by as much as
10 percent.^^ More generally, the “static” welfare gain might well be negative. In a
model with housing, for example, the unbundling of the education and residence decision
that a state system allows relative to a local system will in general imply an increase in
housing prices in relatively poorer communities and a decrease in wealthier ones. Thus,
lower-income individuals will end up paying higher property prices than previously, and
the transition to the new steady state may well involve some losses in early periods. In
the more complicated model studied by Fernandez and Rogerson (1998), this change
in housing prices and the fact that agents’ preferred tax rates differ imply a small
decrease (0.3 percent) in aggregate welfare in the first period of the policy reform.

The more complicated analysis in Fernandez and Rogerson (1998) gives rise to an
even starker illustration of differences in short and long-run welfare. In that paper 
spending on education affects the mean (but not the variance) of the lognormal distri- 
bution from which individual income is assumed to be a random draw. Comparing 
across steady states of a local relative to a state system of financing education, we find 
that, given an individual’s income, each individual prefers a local system to a state 

'‘^See Fernandez and Rogerson (1997a) for a sensitivity analysis for other parameter values.
system. However, an individual’s income is of course not the same across systems since
the probability with which any particular level is realized depends on spending on education,
which in turn depends on the system of financing education. It is taking the
new distribution of income that results into account that yields a higher steady-state
welfare level under the state system.

Next I turn to an analysis based primarily on Benabou (1996). Consider an economy
populated by OLG dynasties indexed by $i$ who spend some amount of time $\nu$ working
and the remainder $1 - \nu$ passing on education to their single child. The law of motion
for the evolution of future descendants’ human capital is given by

$$h^i_{t+1} = \kappa\,\xi^i_t\left((1 - \nu)h^i_t\right)^{\delta}\left(E^i_t\right)^{1-\delta} \tag{2.16}$$

reflecting an inherited portion as given by $h^i_t$ (the parent’s human capital) and an
unpredictable portion given by an iid shock $\xi^i_t$. The shock is assumed to be distributed
lognormally such that $\ln\xi^i_t \sim N(-s^2/2, s^2)$ and thus $E(\xi^i_t) = 1$. Formal schooling, $E^i_t$,
is the other input into the production of next period’s human capital. This is financed
by taxing at rate $\tau$ the labor income of local residents. Hence,

$$E^i_t = \tau Y^i_t = \tau\int_0^\infty y\,dm^i_t(y) \tag{2.17}$$

where $m^i_t$ is the distribution of income (and $Y^i_t$ is its average) in the community to
which family $i$ belongs at time $t$.

The production sector is made up of competitive firms with constant returns to
scale CES technology given by $Y_t = \left(\int_0^\infty x_t(r)^{(\sigma-1)/\sigma}\,dr\right)^{\sigma/(\sigma-1)}$, $\sigma > 1$, where $x_t(r)$ denotes
intermediate input $r$. Each worker must specialize in an intermediate input. As there
are an infinite number of inputs, and each faces a downward sloping demand curve for
its services, each worker will choose to specialize in a different intermediate input such
that $r(i) = i$ and supply that input in the quantity $x^i_t = \nu h^i_t$. Thus aggregate output
simplifies to

$$Y_t = \nu\left(\int_0^\infty h^{(\sigma-1)/\sigma}\,d\mu_t(h)\right)^{\sigma/(\sigma-1)} \equiv \nu H_t \tag{2.18}$$

where $\mu_t$ denotes the distribution of human capital in the entire labor force. Note
that the complementarity between inputs in the production function implies that a
worker’s earnings depend both on her own human capital and on an economy-wide index
of human capital, $H_t$. That is, $y^i_t = \nu(H_t)^{1/\sigma}(h^i_t)^{(\sigma-1)/\sigma}$. This interdependence
is also reflected in the per capita income of each community as
$Y^i_t = \int_0^\infty y\,dm^i_t(y) = \nu(H_t)^{1/\sigma}(L^i_t)^{(\sigma-1)/\sigma}$,
where $L^i_t \equiv \left(\int_0^\infty h^{(\sigma-1)/\sigma}\,d\mu^i_t(h)\right)^{\sigma/(\sigma-1)}$ and $\mu^i_t(h)$ is the distribution of human
capital in the community.

In Benabou (1996), individuals choose how much time to spend working relative to educating their
children so as to maximize the discounted value of future generations’ log of consumption (the dynastic
utility function). Given the assumption of log preferences, all individuals choose the same $\nu$. They
also choose a constant value of $\tau$. See the Appendix in Benabou (1996) for details.

Incorporating the definitions above into the law of motion for the evolution of human
capital (2.16) yields:

$$h^i_{t+1} = K\,\xi^i_t\,(h^i_t)^{\alpha}(L^i_t)^{\beta}(H_t)^{\gamma} \tag{2.19}$$

where $K \equiv \kappa(1 - \nu)^{\delta}(\nu\tau)^{1-\delta}$, $\alpha = \delta$, $\beta = (1 - \delta)(\sigma - 1)/\sigma$, and $\gamma = (1 - \delta)/\sigma$. Note that
this function exhibits constant returns to scale, i.e., $\alpha + \beta + \gamma = 1$, and that the law of
motion incorporates a local linkage $L^i_t$ because education is funded by local funds, and
a global linkage $H_t$ because workers (the inputs) are complementary in production.

The relative merits of a local versus a state system of education can be studied in this
framework by comparing the benefits of a system in which individuals are completely
segregated into homogeneous jurisdictions such that $L^i_t = h^i_t$ with one in which all
communities are integrated and hence $L^i_t = H_t$.

Intuitively, the trade-off between the two systems is clear. On the one hand, comple- 
mentarity and symmetry of inputs in production suggests that total output is maximized 
if individuals are homogeneous, pointing towards the benefits of a more homogenizing 
system such as a state-financed one. On the other hand, the fact that parental human 
capital and community resources are complements (i.e., the marginal return to an extra 
dollar spent on formal education is increasing in the level of parental human capital), 
suggests that at a local level assortative grouping of families is beneficial. The relative 
merits of the two systems, as we shall show, depend on the time horizon. 

To analyze the pros and cons of the two systems, we need to derive the dynamic path
of the economy under each education-finance policy. We do this under the assumption
that the initial distribution of human capital at time t is lognormal, i.e.,
$\ln h_t^i \sim N(m_t, \Delta_t^2)$. The cost of heterogeneity at both the local and global level then
can be seen in that $H_t = e^{-\Delta_t^2/(2\sigma)} E(h) < E(h)$, with the analogous inequality holding
for the local index $L_t^i$ in any heterogeneous community.*^

‘‘^Recall that if $y \sim N(m, \Delta^2)$ and $y = \ln x$, then x is lognormal with $E(x) = e^{m+\Delta^2/2}$ and $Var(x) =$
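Noting that $\ln H_t = m_t + \frac{\sigma-1}{\sigma}\frac{\Delta_t^2}{2}$, the law of motion implied by (2.19) under a
local finance regime, i.e., $h_{t+1}^i = K\,\xi_t^i\,(h_t^i)^{\alpha+\beta}(H_t)^{\gamma}$, implies that the distribution
in the following period will also be lognormal with

$$m_{t+1} = \ln K - \frac{s^2}{2} + m_t + \gamma\,\frac{\sigma-1}{\sigma}\,\frac{\Delta_t^2}{2} \qquad (2.20)$$

$$\Delta_{t+1}^2 = (\alpha+\beta)^2\Delta_t^2 + s^2$$

Similarly, under a state-finance regime, (2.19) implies
$\hat h_{t+1}^i = K\,\xi_t^i\,(\hat h_t^i)^{\alpha}(L_t)^{\beta+\gamma}$.
Thus, if the initial distribution of human capital is described by $\ln \hat h_t^i \sim N(\hat m_t, \hat\Delta_t^2)$,
then $\ln L_t = \hat m_t + \frac{\sigma-1}{\sigma}\frac{\hat\Delta_t^2}{2}$ and next period's distribution of human capital is also
lognormal with

$$\hat m_{t+1} = \ln K - \frac{s^2}{2} + \hat m_t + (\gamma+\beta)\,\frac{\sigma-1}{\sigma}\,\frac{\hat\Delta_t^2}{2} \qquad (2.21)$$

$$\hat\Delta_{t+1}^2 = \alpha^2\hat\Delta_t^2 + s^2$$

(where ^ is used to denote the state-finance regime).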

We examine the implications of both regimes for per capita human wealth
$A_t \equiv \int_0^\infty h\,d\mu_t(h)$. Under a local finance regime,
$A_{t+1} = K \int_0^\infty h^{\alpha+\beta}\,d\mu_t(h)\,(H_t)^{\gamma}$, which, using
(2.20), implies:

$$\ln\frac{A_{t+1}}{A_t} = \ln K - \left((\alpha+\beta)(1-\alpha-\beta) + \frac{\gamma}{\sigma}\right)\frac{\Delta_t^2}{2} \qquad (2.22)$$

The first term represents the growth rate of a standard representative agent economy.
When agents are heterogeneous in terms of their human capital, however, $\alpha+\beta < 1$,
$\gamma < 1$, and Jensen's inequality imply $\int_0^\infty h^{\alpha+\beta}\,d\mu_t(h) < A_t^{\alpha+\beta}$ and $H_t < A_t$. These
differences are reflected in the last term of (2.22), which captures the decrease in growth due
to heterogeneity as a product of the current variance times a constant term that measures
the economy's efficiency loss per unit of dispersion, $\Pi \equiv (\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma$.

These losses reflect the concavity of the combined education production function
and the complementarity $1/\sigma$ of inputs in production, which has weight $\gamma$ in the
economy-wide aggregate H.

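$e^{2m+2\Delta^2} - e^{2m+\Delta^2}$. Furthermore, $\left(E[x^{(\sigma-1)/\sigma}]\right)^{\sigma/(\sigma-1)} = e^{-\Delta^2/(2\sigma)} E(x)$.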

‘*‘*Note that the same reasoning implies that heterogeneity in human capital is a source of gain when 
For the state-finance system, similar derivations yield

$$\ln\frac{\hat A_{t+1}}{\hat A_t} = \ln K - \left(\alpha(1-\alpha) + \frac{\beta+\gamma}{\sigma}\right)\frac{\hat\Delta_t^2}{2} \qquad (2.23)$$

and thus $\hat\Pi = \alpha(1-\alpha) + (\beta+\gamma)/\sigma$. The interaction of heterogeneous agents at the local
level imposes a loss of $\beta/\sigma$, and the concavity of the parental human capital contribution
to the production function (i.e., $\alpha < 1$) implies losses from heterogeneity, along with the
usual losses stemming, as before, from the complementarity in production in the
economy-wide aggregate H.

The analysis above implies that for given rates of resource and time investment in 
education, r and u, in the short run a state-finance education system will lead to lower 
human capital accumulation than a local system. To see this, note that 

$$\phi \equiv \Pi - \hat\Pi = -\delta(1-\delta)\left(1 - \frac{1}{\sigma^2}\right) < 0$$

implying that the drag on growth from heterogeneity is greater in a state-financed 
system. That is, two economies that start out with the same distribution of human 
capital in the first period will have a greater level of human capital in the second period 
under a local regime than under a state finance regime. 

In the long-run, however, the conclusion is different. The handicap to growth from 
heterogeneity under a state regime tends to get reduced, as individuals have access to the 
same formal education system, whereas this source of heterogeneity is maintained under 
a local system in which education funding depends on family human capital. Solving 
for the long-run variances of the two systems given the same initial conditions yields: 
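$\Delta_t^2 = \Delta_\infty^2 + (\alpha+\beta)^{2t}(\Delta_0^2 - \Delta_\infty^2)$ and $\hat\Delta_t^2 = \hat\Delta_\infty^2 + \alpha^{2t}(\Delta_0^2 - \hat\Delta_\infty^2)$, where
$\Delta_\infty^2 = s^2/(1-(\alpha+\beta)^2)$ and $\hat\Delta_\infty^2 = s^2/(1-\alpha^2)$.

Note that we can write $\ln A_t$ as
$\ln A_0 + t\ln K - \frac{\Pi}{2}\left[(\Delta_0^2 - \Delta_\infty^2)\frac{1-(\alpha+\beta)^{2t}}{1-(\alpha+\beta)^2} + t\Delta_\infty^2\right]$ and
similarly $\ln \hat A_t = \ln A_0 + t\ln K - \frac{\hat\Pi}{2}\left[(\Delta_0^2 - \hat\Delta_\infty^2)\frac{1-\alpha^{2t}}{1-\alpha^2} + t\hat\Delta_\infty^2\right]$. Hence, taking the limit
of these expressions as $t \to \infty$, we obtain that in the case of no uncertainty in which
initial endowments are the only source of inequality (i.e., $s^2 = 0$), in the long run the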

agents are substitutes in the production function or when the inputs of the community do not consist 
solely of education funds but also, say, peer effects that on aggregate imply increasing returns to scale 
in human capital at the local level. 
two economies grow at the same rate (namely $\ln K$) and converge to a constant ratio of
per capita human capital levels,

$$\ln\frac{\hat A_\infty}{A_\infty} = \left(\frac{\Pi}{1-(\alpha+\beta)^2} - \frac{\hat\Pi}{1-\alpha^2}\right)\frac{\Delta_0^2}{2} = \Psi\,\frac{\Delta_0^2}{2}$$

where $\Psi \equiv \frac{\Pi}{1-(\alpha+\beta)^2} - \frac{\hat\Pi}{1-\alpha^2} > 0$.

If there is uncertainty in the generation of human capital, then for t sufficiently
large,

$$\ln\frac{\hat A_t}{A_t} \approx \Psi\,\frac{s^2}{2}\,t$$

and the growth rate of the state-finance education system exceeds that of the local
regime by $\Psi s^2/2$. Hence state-financing raises the long-run level of human capital by
$\Psi\Delta_0^2/2$ (in log terms) when there is no uncertainty and raises the long-run growth rate
of human capital by $\Psi s^2/2$ when there is uncertainty. Thus, in the long run a state system
always does better. Whether a local or state education system is preferable will depend on how
we discount different generations' welfare. For a sufficiently patient social planner, the
state education system will be preferred.
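
The short-run versus long-run comparison can be checked numerically. The sketch below is only an illustration under stated assumptions: it reads the loss coefficients as $\Pi = (\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma$ for local finance and $\hat\Pi = \alpha(1-\alpha) + (\beta+\gamma)/\sigma$ for state finance, takes the variance recursions to be $\Delta_{t+1}^2 = (\alpha+\beta)^2\Delta_t^2 + s^2$ and $\hat\Delta_{t+1}^2 = \alpha^2\hat\Delta_t^2 + s^2$, and uses hypothetical parameter values ($\delta = 0.5$, $\sigma = 2$, unit initial variance, $\ln K = 0$) that are not taken from Benabou (1996). The names `Regime`, `regimes` and `log_wealth_path` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Regime:
    persistence: float  # coefficient multiplying log human capital in the variance recursion
    loss: float         # efficiency loss per unit of dispersion (Pi or Pi-hat)


def regimes(delta, sigma):
    """Build the local and state regimes from delta and sigma (assumed reading of the model)."""
    alpha = delta
    beta = (1 - delta) * (sigma - 1) / sigma
    gamma = (1 - delta) / sigma
    local = Regime(alpha + beta, (alpha + beta) * (1 - alpha - beta) + gamma / sigma)
    state = Regime(alpha, alpha * (1 - alpha) + (beta + gamma) / sigma)
    return local, state


def log_wealth_path(regime, var0, s2, periods, ln_k=0.0):
    """Return ln(A_t / A_0) for t = 0..periods under the given regime."""
    path, var, level = [0.0], var0, 0.0
    for _ in range(periods):
        level += ln_k - regime.loss * var / 2      # growth net of the heterogeneity drag
        var = regime.persistence ** 2 * var + s2   # next period's variance of log human capital
        path.append(level)
    return path


# Hypothetical parameters, chosen only for illustration.
local, state = regimes(delta=0.5, sigma=2.0)
local_path = log_wealth_path(local, var0=1.0, s2=0.0, periods=200)
state_path = log_wealth_path(state, var0=1.0, s2=0.0, periods=200)
print("after one period :", local_path[1], state_path[1])
print("after 200 periods:", local_path[-1], state_path[-1])
```

With these illustrative values the local regime is ahead after one period, while the state regime is ahead once the variances have died out, which is the pattern described in the text.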



3. Sorting into Schools 

At some level it is possible simply to repeat much of the analysis of the preceding 
sections but refer to schools rather than neighborhoods. Obviously little additional 
insight would be gained by doing this. A topic which did not have a natural place in
the previous section is how the possibility of attending a private rather than a public
school matters.

Introducing private schooling in a model which includes public schooling is in general
problematic since in these models the funding of public schools is usually decided by
majority vote at the local level, making it difficult to obtain existence of a majority vote
equilibrium."^^ The problem lies in the fact that those individuals who opt out of public
schooling prefer (in the absence of externalities) to provide zero funding for public
schools.

"^^See, though, Epple and Romano (1996), Fernández and Rogerson (1995) and Glomm and Ravikumar
(1998) for some related attempts.
Epple and Romano (1998) provide a model that allows one to study some of the
interactions between the private and public provision of education in an economy where
the demand for education depends both on ability and on income. They sidestep the
problem of funding for education by assuming that the quality of a school depends
only on the mean ability of its students. Although their theoretical results are somewhat
incomplete given the difficulty of characterizing equilibria in an economy in which
individuals differ in more than one dimension, their model nonetheless provides an
extremely useful framework to begin thinking about sorting into schools.^® The rest of
this section is primarily dedicated to a discussion of their model."^^

Consider an economy in which students are assumed to differ in ability b and in
income y. A school's quality is determined solely by the mean ability, q, of the student
body. Students care about the quality of the school as their utility depends on their
achievement a, a function of their own ability b and school quality. They also care about
private consumption, which will equal their income minus the price p they pay for
schooling. Public schools are free and financed (so that costs are covered) by proportional
income tax rates, t. Letting $y_t$ denote after-tax income, individuals maximize:

$$V = V(y_t - p,\ a(q, b)) \qquad (3.1)$$

The authors characterize the equilibrium distribution of student types (y, b) across
public and private schools assuming that types are verifiable. Preferences are assumed
to be single crossing in income in the (q, p) plane, i.e., (2.2) holds. That implies that,
for the same ability level, students with higher income will be willing to pay a higher
price to attend a school with higher mean ability. Preference for quality is also assumed
to be non-decreasing in ability; that is, the willingness to pay for a marginal increase in q
does not fall as b rises.

All schools have the same cost function consisting of a fixed cost and an increasing, 
convex variable cost in the number N of students, c(N). Public schools all offer the same
quality schooling. The number of public schools simply minimizes the cost of operating 
the public sector which is financed by a proportional income tax on all households. 
Private-sector schools, on the other hand, maximize profits and there is free entry and 

'‘^Furthermore, for the interesting case of Cobb-Douglas preferences, their characterization holds as 
will be discussed later. 

'‘^See also Caucutt (forthcoming) for a discussion of how different policies matter when students sort 
(in a complex fashion) across schools by ability and income. 
exit.

Private schools maximize profits taking as given the competitive utility V*(y, b) the
student could obtain elsewhere. Schools can condition prices on ability and income. 
Thus, the profit maximization problem of a private school is to choose prices as a 
function of ability and income and the proportion of each type of student it wishes to 
admit (recognizing that there is a limit to the number of students of each type) taking 
into account the effects that these choices have on school quality and on cost via the 
types and number of students admitted. 

The solution to private school j's maximization problem is characterized by a first
order condition that, for an interior solution for that student type, equates the effective
marginal cost of admitting the additional student i of type $(b_i, y_i)$ to its reservation
price. Note that when a school admits a student with ability $b_i$, its quality changes
by $(b_i - q_j)/N$. The effective marginal cost of admitting this student is thus the increase in
cost $c'(N)$ resulting from the fact that an additional student is being admitted minus
the change in marginal revenue due to that student's effect on the school's quality.^^
The reservation price of a particular type of student is given by the maximum price
$p_i$ the school can charge (given its quality) so as to leave the individual at her market
utility. Note that this implies that some student types will not be admitted since their
reservation price is too low to cover their effective marginal cost.

The equilibrium that emerges from this model has some nice properties.^® As shown 
in Figure 2, there will be a strict hierarchy of school qualities $q_n > q_{n-1} > \dots > q_0$, with
the public sector (denoted by j = 0) having the lowest-ability peer group. Define 
the boundary loci between two schools as the set of types who are indifferent between 
the two schools (a curve with zero measure). Students who are on the boundary loci 
between two private schools will be charged their effective marginal costs; all other 
students will be charged strictly more than their effective marginal costs. This follows 
from the fact that students on the boundary are indifferent between attending either 
of the two schools competing for them, which drives down the price that each school 
can charge to that ability type’s effective marginal cost. Furthermore, since a type’s 

'*®Thus the effective marginal cost can be negative for a relatively high-ability student, leading to the
possibility of negative prices (e.g., fellowships) in equilibrium.

^®”Epple, Newlon, and Romano (forthcoming) adapt this model to study ability tracking (or stream- 
ing) in public and private schools. Epple and Romano (1999) use a modified version of the model to 
study voucher design. 
effective marginal cost is independent of income, their price will only depend on their
ability. For students within the boundary loci, on the other hand, the fact that they
are not indifferent over which school they attend leaves the school with some monopoly
power, which the school exploits by increasing the price. Hence, in general, the price
charged to students within a school's boundary loci will depend both on ability and
income. Note though that competition and free entry among schools implies that a
school's profit is equal to zero.^

Lastly, it is also possible to characterize the type of students that will attend each
school in equilibrium. The single-crossing condition in income ensures that if an individual
with income $y_i$ attends a school with quality $q_j$, then all individuals with the
same ability but greater income will attend schools of at least that level of quality and
all individuals with lower income will attend schools with no greater quality.^ ^

Thus, this model yields stratification by income. Stratification by ability need not
follow, although the authors are able to find conditions (unfortunately on equilibrium
variables) such that schools will also be stratified by ability.^^ Note that, as public
schools have the lowest quality level, they will be composed of low-income individuals.
If stratification by ability also holds, then public schools will consist of the lowest income
and lowest ability students.

To understand the normative implications of the model, first suppose that no public
option exists. Given the number of private schools, the allocation of types into schools
is Pareto efficient. This is because private schools internalize the ability externality
in their choices and perfectly price discriminate over income. The equilibrium
number of schools is not generally efficient, however, because the finite size of
schools implies entry externalities. Furthermore, public sector schooling in this model
in general implies Pareto inefficiency even given the equilibrium number of schools. Zero
pricing by public schools independently of ability implies that the allocation of types

As usual with models with fixed costs, free entry does not imply zero profits due to the integer
problem. We will ignore that qualification here.

^^This property of equilibrium does not follow immediately from single-crossing since schools can 
discriminate by types and thus a higher quality school may charge an individual with higher income a 
higher price. This behavior, however, will not disrupt income stratification because effective marginal 
cost depends only on ability and schools are sure to attract all types willing to pay more than effective
marginal cost. 

®^In their working paper (1993), Epple and Romano show that for a Cobb-Douglas specification of
utility U = (yt — the equilibrium yields stratification by ability.
among public and private sector schools is inefficient.^^

A very different issue in sorting into schools is studied by Fernández and Galí (1999).
This paper is primarily interested in the properties of different assignment mechanisms 
under borrowing constraints. They examine a perfectly competitive model in which 
schools that vary in their (exogenous) quality each charge a market-clearing price to 
agents who vary in their ability and income. Schools have a fixed capacity and agents 
are assumed to be unable to borrow. In this model, the assumption that ability a and
school quality q are complements in the production of output x (a, q) implies that a 
social planner (or perfect capital markets) would assign the highest ability student to 
the highest quality school, the next highest ability student to the next highest quality 
school and so forth. A perfectly competitive pricing mechanism does not produce this 
outcome. Instead, lower ability but higher income individuals are able to outbid higher 
ability but lower income agents for a place in a high quality school. 

The equilibrium outcome above is contrasted with an exam mechanism that assigns
students to schools based on their performance on the exam. The exam score is assumed
to be an increasing function of expenditures on education (e.g., better preparation, tu- 
tors, etc.) and innate ability. The exam technology is such that the marginal increment 
in expenditure required to increase a given score is decreasing in ability. 

The authors find that an exam mechanism will always produce greater output. How- 
ever, as expenditures under an exam system are wasteful, aggregate consumption need 
not be higher. The authors show, nonetheless, that for a sufficiently powerful exam 
technology (one that is sufficiently sensitive to ability relative to expenditures), the exam 
mechanism will always dominate the market mechanism for both aggregate production 
and consumption. 

4. Household Sorting 

People sort not only into neighborhoods and schools; they also sort at the household level by
deciding whom to “marry” or, more generally, whom to match with. Although there is a
small literature that analyzes the economics of matching (e.g., Becker (1973) and Burdett
and Coles (1997)), there has been very little analysis, empirical or theoretical, of how
this interacts with other general equilibrium variables such as growth and inequality.^^

^^See Epple and Romano (1998) for a fuller discussion.

What are the consequences of household sorting for the transmission of education 
and inequality? Following Fernández and Rogerson (2001b), I will set down a rudimentary
model that allows us to examine this issue. This model will leave exogenous several
important features of the decision problem (such as who to match with and fertility),
but it will simplify the analysis of key features of the transmission process.^

Consider an OLG model with two types of individuals, skilled (s) and unskilled (u),
in which the level of skill is also synonymous with the level of education (college and
non-college respectively). These individuals meet, match, have children, and decide how 
much education to give each of their children. 

Given a population at time t whose number is given by $N_t$ and some division of that
population into skilled workers, $N_{st}$, and unskilled workers, $N_{ut}$, where $N_t = N_{st} + N_{ut}$,
let $\beta_t$ denote the fraction of the population that is skilled, i.e., $\beta_t = N_{st}/N_t$. Rather than
endogenize matches, we assume an exogenous matching process in which a fraction $\theta$ of
the population matches with probability one with someone of the same type, whereas
the remainder match at random. As there are two types of individuals, this gives rise
to three types of household matches indexed by j, which we shall denote by high (h)
when the match is between two skilled individuals, middle (m) when the match is between
a skilled and an unskilled individual, and low (l) when it is between two unskilled individuals.

The matching process specified above yields $\lambda_{ht} = \theta\beta_t + (1-\theta)\beta_t^2$ as the fraction
of matches that are high, $\lambda_{mt} = 2(1-\theta)\beta_t(1-\beta_t)$ as the fraction of matches that are
middle, and $\lambda_{lt} = \theta(1-\beta_t) + (1-\theta)(1-\beta_t)^2$ as the fraction that are low. Of course,
$\lambda_{ht} + \lambda_{mt} + \lambda_{lt} = 1$. Note that $\theta$ equals the correlation of partners' education levels.

Families have $n \in \{0, 1, \dots, \bar n\}$ children. We allow the probability $\phi_{nj}$ with which
they have a particular number n to depend on the family type, so that average fertility
for a family of type j is given by $f_j = \sum_{n=0}^{\bar n} n\,\phi_{nj}$.
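
A minimal sketch of the matching process may help fix ideas. It assumes the match fractions are $\lambda_h = \theta\beta + (1-\theta)\beta^2$, $\lambda_m = 2(1-\theta)\beta(1-\beta)$ and $\lambda_l = \theta(1-\beta) + (1-\theta)(1-\beta)^2$, and checks two properties stated in the text: the fractions sum to one, and $\theta$ is the correlation of partners' education levels. The function names are hypothetical.

```python
def match_fractions(beta, theta):
    """Shares of high, middle and low matches given skilled share beta and sorting theta."""
    high = theta * beta + (1 - theta) * beta ** 2
    middle = 2 * (1 - theta) * beta * (1 - beta)
    low = theta * (1 - beta) + (1 - theta) * (1 - beta) ** 2
    return high, middle, low


def partner_correlation(beta, theta):
    """Correlation between the skill indicators of the two partners in a match."""
    high, middle, _ = match_fractions(beta, theta)
    mean = high + middle / 2        # probability that a given partner is skilled
    covariance = high - mean ** 2   # E[xy] - E[x]E[y], with E[xy] = share of high matches
    variance = mean * (1 - mean)    # variance of a 0/1 indicator
    return covariance / variance


shares = match_fractions(beta=0.6, theta=0.6)
print(shares, sum(shares))
print(partner_correlation(beta=0.6, theta=0.6))
```

The correlation returned does not depend on $\beta$, only on $\theta$, which is why $\theta$ can be calibrated directly to the observed correlation of spouses' education.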

Children are either “college material” (whereupon if they went to college they would 
become a skilled worker) or they are not and sending them to college would still produce 
an unskilled worker. We denote these types as either high or low “aptitude” and allow 

^'^Some exceptions are Cole, Mailath, and Postlewaite (1992) and Kremer (1997).

®^See Fernández, Guner and Knowles (2000) and Fernández and Pissarides (2000) for models that
endogenize several of these features.
the probability $\gamma_j$ that a child is of high aptitude to depend on her parental type.^® If
a high aptitude child is sent to college, she earns the skilled wage, $w_s$; otherwise she
earns the unskilled wage $w_u$.

Lastly, we come to the education decision. We assume that the cost of college is
given by $\nu > 0$. Capital and insurance markets are imperfect in that parents cannot
borrow to finance the college education of their children but must finance it from their 
earnings. Insurance (as to which type of child a family might have) is also assumed not 
to be available. The assumption of not being able to borrow for a college education is 
not necessarily meant to be taken literally. Rather we have in mind the local primary 
and secondary education system described earlier whereby education is financed to a 
large extent at the local level and minimum lot sizes (or higher borrowing costs), for 
example, constrain the quality of education that less wealthy parents are able to give 
to their children. 

Parents choose per family member consumption level c and the number, r, of their 
high-aptitude children to educate so as to maximize the utility function below:

U = l (4.1) 

I (c - ^ otherwise 

implying that, subject to a minimum per family member consumption level of $\bar c$, parents
will send a high-ability child to college if they can afford to (and it is economically
advantageous to do so). The family budget constraint is given by $(2+n)c + r\nu \le I_j(\beta)$,
$0 \le r \le a$, where a is the total number of high aptitude children the family has, and

$$I_j(\beta) = \begin{cases} 2w_s(\beta) & \text{for } j = h \\ w_s(\beta) + w_u(\beta) & \text{for } j = m \\ 2w_u(\beta) & \text{for } j = l \end{cases} \qquad (4.2)$$



Lastly, wages are determined in a competitive market as the appropriate marginal

^®One should consider aptitude to reflect family background in the sense of making it more probable 
that a child will obtain a college education. It should be noted that this is not really standing in for a 
genetically determined process since in that case we would have to keep track of whether a particular 
match consisted of 0, 1, or 2 high- aptitude individuals. 
revenue products of a constant returns to scale aggregate production function given by:

$$F(N_s, N_u) = N_u F(N_s/N_u, 1) = N_u F\!\left(\tfrac{\beta}{1-\beta}, 1\right) \equiv N_u f(\beta), \qquad f' > 0,\ f'' < 0 \qquad (4.3)$$

Hence, wages are solely a function of $\beta$ and given by $w_s(\beta) = (1-\beta)^2 f'(\beta)$ and
$w_u(\beta) = f(\beta) - \beta(1-\beta) f'(\beta)$. Note that (4.3) implies that skilled wages are decreasing in the
ratio of skilled to unskilled workers whereas unskilled wages are increasing. Also note
that no family would want to send their child to college if the fraction of skilled workers
exceeds $\bar\beta$, where $\bar\beta$ is defined by $w_s(\bar\beta) = w_u(\bar\beta) + \nu$.
To solve for the steady states we need one additional piece of information, $\Gamma_j(z_j(\beta))$,
the average proportion of children sent to college by families of type j. This will depend
on how constrained each family is (this may differ according to family size and how many
high aptitude children they have), $z_{nj}$, which in turn depends on wages and hence on
$\beta$. Hence,

$$\Gamma_j(z_j(\beta)) = \frac{1}{f_j}\sum_{n} \phi_{nj}\left[\sum_{a=0}^{z_{nj}} a\binom{n}{a}\gamma_j^a(1-\gamma_j)^{n-a} + \sum_{a=z_{nj}+1}^{n} z_{nj}\binom{n}{a}\gamma_j^a(1-\gamma_j)^{n-a}\right] \qquad (4.4)$$

where the first summation term within the square brackets is the number of children
that attend college from families of type j with n children that are not constrained (as
the number of high-aptitude kids they have is fewer than $z_{nj}$) and the second summation
is over the number of children that attend college from constrained families of type j
with n children.^^

The steady states of the economy are the fixed points of the dynamic system below:

$$\beta_{t+1}(\theta) = \frac{N_{s,t+1}}{N_{t+1}} = \frac{\sum_j \Gamma_j(z_j(\beta_t))\, f_j\, \lambda_{jt}(\beta_t;\theta)}{\sum_j f_j\, \lambda_{jt}(\beta_t;\theta)} \qquad (4.5)$$

i.e., a level of $\beta$ such that $\beta_t = \beta_{t+1}$. We restrict our attention to those that are
locally stable, i.e., $|\partial\beta_{t+1}/\partial\beta_t| < 1$.

^^If a family of type j with n children is not constrained, we simply indicate this by $z_{nj} = n$.

Note that in general there may be multiple steady states. To see why this is so,
consider what may happen if we start out with a low level of $\beta$. In this case, low type
families (and perhaps middle types as well) will be relatively constrained since unskilled
wages are low. Thus, this will tend to perpetuate a situation in which $\beta$ is low next
period as well and thus a steady state with a low proportion of skilled workers (and
high inequality). If, on the other hand, the economy started out with a high level
of $\beta$, unskilled wages would be high and hence low type families would be relatively
unconstrained, perpetuating a situation of high $\beta$ (and low inequality).

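How does the degree of sorting affect this economy? If the change in sorting is
sufficiently small that the degree to which constraints are binding is unaffected (i.e., the
$\Gamma_j$'s are constant), then

$$\frac{d\beta}{d\theta} = \frac{\beta(1-\beta)\left[f_h(\Gamma_h - \beta) - 2f_m(\Gamma_m - \beta) + f_l(\Gamma_l - \beta)\right]}{D} = \frac{\beta(1-\beta)\left[(f_h\Gamma_h - 2f_m\Gamma_m + f_l\Gamma_l) - \beta(f_h - 2f_m + f_l)\right]}{D} \qquad (4.6)$$

where $D = \sum_j f_j\lambda_j(\beta;\theta) + \sum_j (\beta - \Gamma_j) f_j\,\partial\lambda_j/\partial\beta$. It is easy to show that local stability
requires $D > 0$.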

The expression in (4.6) is easy to sign for a few cases. Suppose all the $\Gamma_j$'s are
the same, i.e., $\Gamma_j = \Gamma$. In that case, $\beta = \Gamma$ and the extent of sorting does not affect
the personal income distribution (though it does the household income distribution) as
wages are unchanged.^®

Suppose next that average fertility is the same across all groups, i.e., $f_j = f$. In
this case, the sign of (4.6) is given by the sign of $\Gamma_h + \Gamma_l - 2\Gamma_m$. The intuition behind
this is simple. Note that the effect of an increase in sorting is to destroy middle-type
matches and replace these by high and low ones. In particular, for every two middle
matches destroyed, one high and one low match are created. Since average fertility is
the same across family types, the effect of increased sorting depends on whether the
fraction of children sent to college on average by two middle-type marriages ($2\Gamma_m$) is
smaller than the combined fraction of children that go to college on average in one high
and one low type family ($\Gamma_h + \Gamma_l$). Thus, if the relationship between parents' education
and children's education is linear, changes in sorting will have no effect on $\beta$; if concave,
increased sorting will decrease $\beta$, and the reverse if the relationship is convex.

Recall that we are assuming that constraints are unaffected by the change in sorting.

Lastly, making no assumptions about fertility or the $\Gamma_j$'s, a sufficient condition for
an increase in sorting to decrease $\beta$ is $f_h\Gamma_h - 2f_m\Gamma_m + f_l\Gamma_l \le 0$ and $f_h + f_l - 2f_m \ge 0$
(with at least one inequality strict). The first expression is the counterpart of the
expression in the preceding paragraph. That is, subject to no change in population
growth it ensures that there will be fewer skilled individuals in the following period.
The second expression ensures that the population growth rate will not decline as a
result of the increased sorting (thereby potentially giving rise to a larger proportion of
skilled people despite the fall in their growth rate).^^

The above discussion assumed that the $\Gamma_j$'s remained invariant to the change in
sorting. Note, however, that these may well change as constraints become more or less
binding as a result of the change in wages.® ^ Hence, even if fertility is exogenous, the
sign of $f_h\Gamma_h - 2f_m\Gamma_m + f_l\Gamma_l$ is in general endogenous since the $\Gamma_j$'s are endogenous
variables.®^ Thus, whether the expression is concave or convex may itself depend on $\beta$.

Fernández and Rogerson (2001b) explore the effect of increased sorting on inequality
by calibrating the model above to US data. They use the PSID to obtain a sample
of parents and children and group all individuals with high school and below into the
unskilled category and everyone who has had at least some college into skilled. The
correlation of parental education ($\theta$) equals .6. Average fertility is given by $f_h = 1.84$,
$f_m = 1.90$, and $f_l = 2.26$ (from PSID and Mare (1997)). For any average fertility
number, the two integers that bracket the average are chosen as the only two possible
number of children to have, with the appropriate weights used as the probabilities (e.g.,
$\phi_{1h} = .16$, $\phi_{2h} = .84$).

To calibrate the model, we need to know the $\gamma_j$'s. These are not available in the
data, but what is computable from the PSID are the $\Gamma_j$'s (i.e., the fraction of children
of each family type that on average attend college). These are given by $\Gamma_h = .81$,
$\Gamma_m = .63$, and $\Gamma_l = .30$. Note from (4.4) that any value of $\Gamma_j$ can be decomposed
into an assumption about how “inheritable” education is (the $\gamma_j$'s) and a corresponding
assumption about how binding borrowing constraints are (the $z_{nj}$'s). The table below
shows various such decompositions for $\Gamma_l$ (for the other $\Gamma_j$'s it is assumed that the
constraints are not binding and hence $\Gamma_j = \gamma_j$).

®®The opposite signs on the two expressions is a sufficient condition for increased sorting to increase $\beta$.

®®In a more general model where household incomes were continuous, a change in $\theta$ that affected $\beta$
for a constant set of $\Gamma$'s would also necessarily affect the $\Gamma_j$'s.

®^In a more complex model in which fertility and/or matching are endogenized, one can perform
a similar exercise by changing technology such that the skill premium for any $\beta$ is higher or by changing
the cost of search.

Table 2

Aptitude Profiles Under Various Scenarios ($z_{nm} = n$, $z_{nh} = n$)

              z_nl = n    z_nl = 2    z_2l = 2, z_3l = 1    z_nl = 1
  gamma_h     .81         .81         .81                   .81
  gamma_m     .63         .63         .63                   .63
  gamma_l     .30         .303        .334                  .401


Fernández and Rogerson (2001b) use the second column as their benchmark. Note
that this implies the existence of very mild constraints. Only low-type families with
three high-ability children are affected and these are fewer than 1 percent of low-type
families.
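
The decomposition for low-type families can be sketched numerically. The code below is an illustration under stated assumptions rather than the authors' procedure: it takes $f_l = 2.26$ to mean that low-type families have two children with probability .74 and three with probability .26 (the integer-bracketing rule described above), assumes aptitude is independent across siblings with probability $\gamma_l$, caps the number of children sent to college at $z_{nl}$, and solves by bisection for the $\gamma_l$ that reproduces $\Gamma_l = .30$. The function names and the dictionary `SCENARIOS` are hypothetical.

```python
from math import comb


def college_share(gamma, family_sizes, cap):
    """Average share of children sent to college.

    family_sizes maps number of children n to its probability;
    cap(n) is the most children a family of size n can afford to educate.
    """
    mean_children = sum(n * p for n, p in family_sizes.items())
    sent = 0.0
    for n, p in family_sizes.items():
        for a in range(n + 1):  # a = number of high-aptitude children
            prob_a = comb(n, a) * gamma ** a * (1 - gamma) ** (n - a)
            sent += p * prob_a * min(a, cap(n))
    return sent / mean_children


def implied_gamma(target, family_sizes, cap, tol=1e-10):
    """Bisection for the aptitude probability that yields the target college share."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if college_share(mid, family_sizes, cap) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


LOW_TYPE = {2: 0.74, 3: 0.26}  # assumed family-size distribution implying f_l = 2.26
SCENARIOS = {
    "unconstrained": lambda n: n,
    "at most two": lambda n: 2,
    "two if n=2, one if n=3": lambda n: 2 if n == 2 else 1,
    "at most one": lambda n: 1,
}
for label, cap in SCENARIOS.items():
    print(label, round(implied_gamma(0.30, LOW_TYPE, cap), 3))
```

Under these assumptions the implied values of $\gamma_l$ come out close to the last row of Table 2, and in the benchmark scenario the share of low-type families that are actually constrained, $.26\,\gamma_l^3$, is below one percent.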

This information along with the $\Gamma_j$'s allows us to compute the steady state, yielding
$\beta = .60$. To obtain wages, we use a CES production function
$y = A[bN_s^{\rho} + (1-b)N_u^{\rho}]^{1/\rho}$ and match the steady-state ratio of skilled to unskilled wages to 1.9 (Katz and
Murphy (1992)) and obtain $\rho = .33$ by matching an elasticity of substitution between
skilled and unskilled workers of 1.5 (see survey by Katz and Autor (1999)). Lastly,
for ease of interpretation of our results, we choose a value of A to scale steady-state
unskilled wages to some “reasonable” value, which we set to be 30,000. This is purely
a normalization.
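
The steady-state computation can be sketched as a fixed-point iteration. The code assumes that the law of motion is $\beta_{t+1} = \sum_j \Gamma_j f_j \lambda_j / \sum_j f_j \lambda_j$ with the $\Gamma_j$'s held constant, uses the match fractions defined earlier, and plugs in the calibrated values reported in the text ($\Gamma = (.81, .63, .30)$, $f = (1.84, 1.90, 2.26)$). The function names, starting value and tolerance are hypothetical choices.

```python
def match_fractions(beta, theta):
    """Shares of high, middle and low matches for skilled share beta and sorting theta."""
    return (
        theta * beta + (1 - theta) * beta ** 2,
        2 * (1 - theta) * beta * (1 - beta),
        theta * (1 - beta) + (1 - theta) * (1 - beta) ** 2,
    )


def next_beta(beta, theta, college, fertility):
    """Skilled share of next period's population, holding college rates fixed."""
    lam = match_fractions(beta, theta)
    skilled = sum(g * f * l for g, f, l in zip(college, fertility, lam))
    children = sum(f * l for f, l in zip(fertility, lam))
    return skilled / children


def steady_state(theta, college, fertility, beta=0.5, tol=1e-12, max_iter=10_000):
    """Iterate the law of motion until the skilled share stops changing."""
    for _ in range(max_iter):
        updated = next_beta(beta, theta, college, fertility)
        if abs(updated - beta) < tol:
            return updated
        beta = updated
    raise RuntimeError("fixed-point iteration did not converge")


COLLEGE = (0.81, 0.63, 0.30)    # Gamma_h, Gamma_m, Gamma_l from the text
FERTILITY = (1.84, 1.90, 2.26)  # f_h, f_m, f_l from the text
print(steady_state(0.6, COLLEGE, FERTILITY))
print(steady_state(0.7, COLLEGE, FERTILITY))
```

With $\theta = .6$ the iteration settles very close to the reported $\beta = .60$, and raising $\theta$ to .7 with unchanged $\Gamma_j$'s lowers the steady-state skilled share.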

It is important to note that the steady state of the calibrated model fulfills the
sufficient conditions such that an increased $\theta$ leads to a lower proportion of skilled
individuals. Hence, from a theoretical perspective, we know that an increase in sorting
will lead to higher skilled wages and lower unskilled ones. The quantitative impact
is given in the table below. The first row reports mean years of education (in which
the skilled group and unskilled group have been assigned the mean from their PSID
sample). The second row gives the coefficient of variation of education. The last
entry is the standard deviation of log income, our measure of inequality in the personal
income distribution.
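
The claim that the calibrated steady state satisfies the sufficient conditions can be verified directly from the reported numbers. The sketch below assumes the two conditions are $f_h\Gamma_h - 2f_m\Gamma_m + f_l\Gamma_l \le 0$ and $f_h + f_l - 2f_m \ge 0$, and that the bracketed term in (4.6) is the first expression minus $\beta$ times the second; the function name is hypothetical.

```python
def sorting_effect_terms(college, fertility, beta):
    """Return the two sufficient-condition terms and the bracketed term of (4.6)."""
    g_h, g_m, g_l = college
    f_h, f_m, f_l = fertility
    skilled_term = f_h * g_h - 2 * f_m * g_m + f_l * g_l  # change in next period's skilled children
    growth_term = f_h + f_l - 2 * f_m                     # change in total number of children
    bracket = skilled_term - beta * growth_term           # sign of d(beta)/d(theta) when D > 0
    return skilled_term, growth_term, bracket


skilled_term, growth_term, bracket = sorting_effect_terms(
    college=(0.81, 0.63, 0.30), fertility=(1.84, 1.90, 2.26), beta=0.60
)
print(skilled_term, growth_term, bracket)
```

The first term is negative and the second positive, so the bracket is negative and more sorting lowers the steady-state share of skilled individuals.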

The first column of the table reports the result of the calibration.^^ The second
column reports the effect of an increase in sorting to .7 assuming that the values of
$\Gamma$ are unchanged. The third column does the same but assumes that the decrease in
the unskilled wage means constraints are tightened for low-type families and that those
with three children can only afford to send a maximum of one of them to college.

Table 3 

Effects of Increased Sorting on Steady State 

                $\theta = .6$    $\theta = .7$    $\theta = .7$
                Ti = .30         Ti = .30         Ti = .27
mean(e)         13.52            13.48            13.40
cv(e)           .134             .135             .137
$\beta$         .600             .589             .568
$w_s/w_u$       1.900            1.95             2.07
std(log y)      .315             .330             .361


The main message of the table above is that changes in sorting can have large effects 
on inequality and that seemingly small changes in average years of education or in its 
coefficient of variation can underlie large changes in the income distribution. As shown 
in the table, a change in sorting from .6 to .7 will increase the standard deviation of 
log income by a bit under 5 percent in the absence of any assumption about borrowing 
constraints.^63 If, as a result of the approximately $600 drop in the unskilled wage (and 
consequently a $1200 drop in low-type family income), constraints tighten, this leads to 
an increase in inequality of almost 15 percent. In both cases, the effect on the standard 
deviation of log family income is large: 8.3 percent and 19 percent respectively.^64 
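The percentage changes in personal income inequality quoted above can be checked directly from the last row of Table 3. The short sketch below only redoes that arithmetic; the dictionary keys are illustrative labels for the three columns.

```python
# Standard deviation of log income, last row of Table 3.
std_log_income = {
    "theta=.6 (calibration)": 0.315,
    "theta=.7 (constraints unchanged)": 0.330,
    "theta=.7 (constraints tightened)": 0.361,
}


def pct_change(new, base):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base


base = std_log_income["theta=.6 (calibration)"]
unchanged = pct_change(std_log_income["theta=.7 (constraints unchanged)"], base)
tightened = pct_change(std_log_income["theta=.7 (constraints tightened)"], base)
print(f"unchanged constraints: {unchanged:.1f} percent")
print(f"tightened constraints: {tightened:.1f} percent")
```

The first figure falls a bit under 5 percent and the second a bit under 15 percent, in line with the text.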

^62 Note that the standard deviation of log income is about half of what it is in reality for the US. It 
is not surprising that our model is not able to produce as much variation as in the data as there are 
only two wages. 

^63 Note that these results, therefore, are independent of which column we choose from Table ?? as our 
benchmark. 

^64 The results for the $\theta$ increase to .8 follow a pattern similar to the one above. The changes in the 
mean and standard deviation of the education distribution are small, as before, but the changes in the income 
distribution are large. 
The analysis above also points out the dangers with assuming intergenerational 
processes are linear. Kremer (1997), for example, assumes that the years of education 
a child acquires is a linear function of average parental years of education as given by: 



$$e_{i,t+1} = \kappa + \alpha \, \frac{e_{i,t} + e_{i',t}}{2} + \varepsilon_{i,t+1} \qquad (4.7)$$

where $e_{i,t+1}$ is the education level for the child, $e_{i,t}$ and $e_{i',t}$ are the education levels of 
the two parents, and $\varepsilon$ is a normally distributed random shock that is iid across families, 
with mean 0 and standard deviation equal to $\sigma_\varepsilon$. Parents are all assumed to have two 
kids and an (exogenous) assortative matching of individuals takes place yielding $\theta$ as 
the correlation between the education levels of parents. 

Note that within the framework of Fernández and Rogerson (2001b), the assumptions 
of a linear transmission process and the same fertility across all parent types would yield 
no effect of an increase in sorting on inequality. In Kremer’s model, this is not the case: 
although the mean of the distribution is unaffected, the inclusion of a shock implies 
that greater sorting will increase inequality. To see this, note that with constant 
parameter values the distribution of education converges to a normal distribution with 
steady-state mean and standard deviation given by 

$$\mu_\infty = \frac{\kappa}{1 - \alpha} \quad \text{and} \quad \sigma_\infty = \frac{\sigma_\varepsilon}{[1 - \alpha^2 (1 + \theta)/2]^{.5}} \qquad (4.8)$$

respectively, where $\kappa$ and $\alpha$ are the intercept and slope in (4.7). Thus an increase 
in $\theta$, while not affecting the mean, increases the variance of the distribution of education. 

To investigate the effects of sorting within this model, Kremer uses PSID data to run 
the regression suggested by (4.7), and finds that $\alpha$ equals .4. Parents’ correlation in years 
of education, as we saw previously, is .6. This implies, using (4.8), that even a large 
increase in the correlation of parental education, say from .6 to .8, will only increase the 
standard deviation of the distribution of education by about 1 percent. Furthermore, 
if we assume as Kremer does that log earnings are linear in years of education (i.e., 
$\log y_{i,t+1} = a + b e_{i,t+1}$), then exactly the same conclusion applies to the distribution of 
earnings. 
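The "about 1 percent" figure can be reproduced with a few lines of arithmetic. The sketch below assumes the steady-state standard deviation of education in Kremer's linear model is $\sigma_\varepsilon / [1 - \alpha^2 (1 + \theta)/2]^{1/2}$, which follows from taking variances on both sides of (4.7) when parental education levels have correlation $\theta$; the function name is hypothetical.

```python
import math


def steady_state_sd(alpha, theta, sigma_eps=1.0):
    """Steady-state standard deviation of education in the linear model.

    Var(e) = alpha**2 * Var(e) * (1 + theta) / 2 + sigma_eps**2, solved for Var(e).
    """
    return sigma_eps / math.sqrt(1.0 - alpha ** 2 * (1.0 + theta) / 2.0)


alpha = 0.4  # regression coefficient reported in the text
ratio = steady_state_sd(alpha, 0.8) / steady_state_sd(alpha, 0.6)
print(f"increase in steady-state sd: {100.0 * (ratio - 1.0):.2f} percent")
```

The shock's standard deviation cancels in the ratio, so the result depends only on $\alpha$ and the two values of $\theta$. Because $\alpha^2$ is small, even a large change in $\theta$ moves the denominator very little, which is why the linear model predicts such a modest effect.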

The very different conclusions obtained by Kremer relative to Fernández and Rogerson 
emphasize the importance of certain features of the data (i.e., fertility differentials 
and non-convexities in the transmission process) as well as the endogeneity of wages. 
Furthermore, as shown in Fernández and Rogerson, borrowing constraints can greatly 
magnify the effect of increased sorting relative to an analysis that takes the shape of the 
transmission process as given, rather than endogenous. 

In light of the above, it is of interest to ask how inequality, fertility and sorting 
are related in a model in which these variables are endogenous. Fernández, Guner 
and Knowles (2000) develop a simple two-period search model in which individuals are 
given multiple opportunities to match with others. As before, there are two types of 
individuals (skilled and unskilled) distinguished only by their educational attainment. 
In the first period we assume that agents meet others from the population in general. 
In the second period agents meet only others who are similar to themselves in terms of 
skill level.^65 Agents’ characteristics (income) are fully observable, as is the quality of 
the match. The latter is assumed to be a random draw from a quality distribution, 
and is fully match specific. If agents decide to keep their first-period match, they are 
unable to search in the second period. 

Having matched, individuals decide how many children to have (at a cost per child 
$t$ that is proportional to income $I$) and devote the rest of their income to consumption. 
Thus individuals maximize: 

$$\max_{c,n} \; [c + \gamma \log(n) + \kappa + q], \qquad (4.9)$$

subject to 

$$c \le I(1 - tn), \quad t > 0,$$

where $n$ is the number of children, $I > \gamma$ is household income, $q$ is the quality of the 
match and $\kappa$ is a constant. Plugging in the optimal decisions for an individual (and 
choosing $\kappa$ such that the sum of the constants is zero) allows us to express the indirect 
utility function as $V(I, q) = I - \gamma \log I + q$. 
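The step from the maximization problem to the indirect utility function can be spelled out as follows. This is a sketch that assumes the budget constraint binds and the solution is interior; the starred symbols are notation introduced here for the optimal choices.

```latex
\begin{align*}
&\max_{n} \; I(1 - tn) + \gamma \log n + \kappa + q
  && \text{(substituting } c = I(1 - tn)\text{)} \\
&-tI + \frac{\gamma}{n} = 0
  \;\Longrightarrow\; n^{*} = \frac{\gamma}{tI}, \qquad c^{*} = I - \gamma
  && \text{(first-order condition)} \\
&V(I, q) = I - \gamma + \gamma \log\frac{\gamma}{tI} + \kappa + q
  = I - \gamma \log I + q + \Bigl[\kappa - \gamma + \gamma \log\frac{\gamma}{t}\Bigr]
\end{align*}
```

Setting $\kappa = \gamma - \gamma \log(\gamma / t)$ makes the bracketed constant vanish, leaving $V(I, q) = I - \gamma \log I + q$. The condition $I > \gamma$ guarantees $c^{*} > 0$, and $n^{*} = \gamma / (tI)$ is decreasing in household income.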

Assuming a constant returns production function allows us, as before, to express 
wages solely as a function of the ratio of skilled to unskilled workers and to express 
household income as in (4.2). The cutoff match quality that a high-wage worker will 
accept in order to match with a low-wage individual in the first period is an increasing 
function of $w_s$ and a decreasing function of $w_u$.^66 

^65 One could just as easily assume that in the first period one meets a more representative sample 
of the population relative to the second period, in which it is biased towards individuals who are similar. 

Children face two costs to becoming a skilled worker. First, there is a constant 
monetary cost of $d$. Second, there is an individual-specific (additive) psychic cost 
(e.g., effort) of $\delta$ with a cumulative distribution function. The return to being a skilled 
worker is the probability of matching with a skilled worker and obtaining household 
income $I_{ss}$ (in which wages are assumed to be net of borrowing and repaying $d$) plus the 
probability of matching with an unskilled worker and obtaining household income $I_{su}$. 
These probabilities depend on the probability that in the first period a particular type 
of worker is met and on the cutoff quality of the match a skilled worker will accept (and 
hence on the fraction of individuals that are skilled in the population, i.e., $\beta$). A similar 
calculation holds for the return to being an unskilled worker.^67 

If there were no borrowing constraints, then all families would have the same fraction 
of children become skilled, so that the net return to being a skilled worker equalled 
the return to being an unskilled worker plus $\delta^*(\beta)$ (the equilibrium psychic cost such 
that no worker with $\delta_i > \delta^*(\beta)$ is willing to become skilled). If, however, there are 
borrowing constraints such that the amount that an individual can borrow depends 
(positively) on family income, then families with higher household income will have a 
higher fraction of their children become skilled. 

How does inequality matter? It is easy to show that as family income increases, 
fertility declines. Thus fertility differentials are increasing with inequality. Furthermore, 
as wage inequality increases, skilled workers become pickier about the quality of 
the match required to make them willing to match with an unskilled worker. 

As before, this model will in general have multiple steady states. If the economy 
starts out with a low proportion of skilled workers, the skill premium will be high, 
skilled workers will be very picky about matching with unskilled workers, and hence 
there will be a high level of sorting. Given borrowing constraints, only a small fraction 
of children from low-income households will become skilled implying that in the next 
period a similar situation will tend to perpetuate itself: a high level of inequality, high 
sorting, and high fertility differentials. The opposite would be true if instead the 
economy starts out with a high proportion of skilled workers. In this case inequality is low, 
high-skilled agents choose a low cutoff quality for matching with unskilled agents so 
sorting is low, fertility differentials are low, and borrowing constraints are not very 
binding. This leads again to a high proportion of skilled workers the following period. 

^66 The skilled worker will always be the one whose cutoff quality level is binding as her income is 
greater. 

^67 Note that unlike Fernández and Rogerson (2001b), the return to being skilled/unskilled depends 
also on how this decision affects the type of match one will obtain. 

We take the implications of this model to the data. Using a sample of thirty-three 
countries we examine the relationship between sorting and inequality and find that, as 
the theory predicts, these are positively correlated. Countries with greater inequality 
exhibit greater sorting at the household level. Furthermore, as also predicted by the 
theory, fertility differentials are increasing in inequality.^68 

5. Concluding Remarks 

This chapter has reviewed some of the principal contributions to the literature that 
examines the links between sorting, education and inequality. Much work remains to 
be done in all of the areas discussed in this chapter: education finance systems and 
residential sorting, schools, and household sorting. In particular, it would be of interest 
to see more work that examined how different education systems matter, and provided 
an empirical basis on which to assess different policy proposals. At the school level, very 
little is known about how parents, teachers, students, administrators and the community 
interact in producing schooling of a particular quality. I think that the largest challenge 
here is the creation of a convincing multiple principal-agent model that endogenizes the 
quality of the school in response to information constraints, the availability of alternative 
options, and the system in which it is embedded. In addition, it would be of interest to 
study the incentive effects of external standards (e.g. national or state level exams) that 
allow schools to be “graded” against one another. Finally, work on household sorting 
is still at an embryonic level, both theoretically and empirically.^69 A notable omission 
from the models discussed above is the role of gender: they do not distinguish between 
the education and income distributions of men and women. It would be of interest to 
examine how these matter and to investigate, empirically and theoretically, the role of 
women’s large increase in labor force participation and educational attainment. 



^68 Kremer and Chen (1999) examine the relationship between fertility and inequality for a large sample 
of countries and find that fertility differentials and inequality are positively correlated. 

^69 See Greenwood, Guner and Knowles (1999) for recent work in this field. 